Abstract

Dodis and Wichs [DW09] introduced the notion of a non-malleable extractor to study the 
problem of privacy amplification with an active adversary. A non-malleable extractor is a much 
stronger version of a strong extractor. Given a weakly-random string x and a uniformly random 
seed y as the inputs, the non-malleable extractor nmExt has the property that nmExt(x, y)
appears uniform even given y as well as nmExt(x, A(y)), for an arbitrary function A with $A(y) \neq y$.
Dodis and Wichs showed that such an object can be used to give optimal privacy amplification
protocols with an active adversary. 

Previously, there were only two known constructions of non-malleable extractors [DLWZ11,
CRS12]. Both constructions only work for (n, k)-sources with k > n/2. Interestingly, both
constructions are also two-source extractors.

In this paper, we present a strong connection between non-malleable extractors and two- 
source extractors. The first part of the connection shows that non-malleable extractors can be 
used to construct two-source extractors. If the non-malleable extractor works for small min-entropy
and has a short seed length with respect to the error, then the resulting two-source
extractor beats the best known construction of two-source extractors. This partially explains
why previous constructions of non-malleable extractors only work for sources with entropy rate 
> 1/2, and why explicit non-malleable extractors for small min-entropy may be hard to get. 

The second part of the connection shows that certain two-source extractors can be used 
to construct non-malleable extractors. Using this connection, we obtain the first construction 
of non-malleable extractors for k < n/2. Specifically, we give an unconditional construction
for min-entropy $k = (1/2 - \delta)n$ for some constant $\delta > 0$, and a conditional (semi-explicit)
construction that can potentially achieve $k = \alpha n$ for any constant $\alpha > 0$.

We also generalize non-malleable extractors to the case where there is more than one adversarial
seed, and show a similar connection between the generalized non-malleable extractors
and two-source extractors.

Finally, despite the lack of explicit non-malleable extractors for arbitrarily linear entropy, we 
give the first 2-round privacy amplification protocol with asymptotically optimal entropy loss 
and communication complexity for (n, k) sources with $k = \alpha n$ for any constant $\alpha > 0$. This
dramatically improves previous results and answers an open problem in [DLWZ11].
Simons postdoctoral fellowship.



1 Introduction 



The broad area of randomness extraction studies the problem of converting a weakly random source
into a distribution that is close to the uniform distribution in statistical distance. Over the past
decades extensive research has been conducted in this area. Within it, a long line of research
([SZ99, Tre01, RRV02, LRVW03, GUV09, DW08, DKSS09] to name a few) studies the so-called
"seeded extractors", as defined by Nisan and Zuckerman [NZ96]. Besides their original motivation
in computing with imperfect random sources, seeded extractors have found applications in coding 
theory, cryptography, complexity and many other areas. We refer the reader to [FS02, Vad02, ?] 
for a survey on this subject. Nowadays we have nearly optimal constructions of seeded extractors 
[LRVW03, GUV09, DW08, DKSS09]. 

Another line of research focuses on the problem of extracting random bits from several independent
sources [CG88, BIW04, BKS+05, Raz05, Bou05, Rao06, BRSW06, Li11]. In this case,
however, the best known construction is far from optimal. Specifically, the probabilistic method
shows that there exists an extractor for two independent sources on n bits with each having roughly
log n bits of entropy, while the best two-source extractor to date can only achieve entropy slightly
below n/2 [Bou05]. The best known extractor for small entropy k requires O(log n / log k) independent
sources [Rao06, BRSW06]. Moreover, it seems hard to improve these results. In particular,
in the two-source case, after decades of effort the entropy requirement has only dropped from anything
above n/2 [CG88] to slightly below n/2 [Bou05].

Recently, a new kind of seeded extractor, called a non-malleable extractor, was introduced in
[DW09] to give protocols for the problem of privacy amplification with an active adversary. We now
give the definition of a non-malleable extractor. For comparison, we also give the definition
of a strong seeded extractor.

Notation. We let [s] denote the set {1, 2, . . . , s}. For $\ell$ a positive integer, $U_\ell$ denotes the uniform
distribution on $\{0,1\}^\ell$, and for S a set, $U_S$ denotes the uniform distribution on S. When used as
a component in a vector, each $U_\ell$ or $U_S$ is assumed independent of the other components. We say
$W \approx_\varepsilon Z$ if the random variables W and Z have distributions which are $\varepsilon$-close in variation distance.

Definition 1.1. The min-entropy of a random variable X is

$$H_\infty(X) = \min_{x \in \mathrm{supp}(X)} \log_2(1/\Pr[X = x]).$$

For $X \in \{0,1\}^n$, we call X an $(n, H_\infty(X))$-source, and we say X has entropy rate $H_\infty(X)/n$. We
say X is a flat source if it is the uniform distribution over some subset $S \subseteq \{0,1\}^n$.
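
As a small illustration of Definition 1.1, the following sketch computes the min-entropy and entropy rate of a distribution given as a dictionary of probabilities. The helper names `min_entropy` and `flat_source` are hypothetical and chosen only for this example.

```python
import math

def min_entropy(probs):
    """H_inf(X) = min over the support of log2(1 / Pr[X = x])."""
    p_max = max(p for p in probs.values() if p > 0)
    return -math.log2(p_max)

def flat_source(support):
    """Uniform distribution over a set S of bit strings (a flat source)."""
    support = set(support)
    return {x: 1.0 / len(support) for x in support}

# A flat source over 4 of the 8 strings in {0,1}^3, i.e. a (3, 2)-source.
n = 3
X = flat_source(["000", "011", "101", "110"])
k = min_entropy(X)   # min-entropy in bits
rate = k / n         # entropy rate H_inf(X) / n
print(k, rate)
```

For a flat source on a set S the min-entropy is simply log2 |S|, which is why flat sources are a convenient test case.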

Definition 1.2. A function $\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a strong $(k, \varepsilon)$-extractor if for every
source X with min-entropy k and independent Y which is uniform on $\{0,1\}^d$,

$$(\mathrm{Ext}(X, Y), Y) \approx_\varepsilon (U_m, Y).$$

Definition 1.3.^1 A function $\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(k, \varepsilon)$-non-malleable extractor
if, for any source X with $H_\infty(X) \geq k$ and any function $A : \{0,1\}^d \to \{0,1\}^d$ such that $A(y) \neq y$
for all y, the following holds. When Y is chosen uniformly from $\{0,1\}^d$ and independent of X,

$$(\mathrm{nmExt}(X, Y), \mathrm{nmExt}(X, A(Y)), Y) \approx_\varepsilon (U_m, \mathrm{nmExt}(X, A(Y)), Y).$$

^1 Following [DLWZ11], we define worst case non-malleable extractors, which is slightly different from the original
definition of average case non-malleable extractors in [DW09]. However, the two definitions are essentially equivalent
up to a small change of parameters.
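
To make Definition 1.3 concrete, here is a minimal brute-force sketch that computes the statistical distance in the definition for one fixed flat source and one fixed adversarial function A. It is only feasible for toy parameters, and it checks a single source and a single A rather than all of them. The names `statistical_distance`, `nm_error` and the toy extractor `ip` are assumptions made for this illustration.

```python
from collections import defaultdict

def statistical_distance(p, q):
    """Variation distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)

def nm_error(ext, support, d, m, A):
    """Distance between (ext(X,Y), ext(X,A(Y)), Y) and (U_m, ext(X,A(Y)), Y),
    for X flat on `support` and Y uniform on {0,1}^d (seeds stored as ints)."""
    real = defaultdict(float)
    ideal = defaultdict(float)
    w = 1.0 / (len(support) * 2**d)
    for x in support:
        for y in range(2**d):
            assert A(y) != y  # the adversary must change the seed
            z, z_adv = ext(x, y), ext(x, A(y))
            real[(z, z_adv, y)] += w
            for u in range(2**m):
                ideal[(u, z_adv, y)] += w / 2**m
    return statistical_distance(real, ideal)

def ip(x, y):
    """Toy 1-bit extractor: inner product over F_2 of two bit vectors (ints)."""
    return bin(x & y).count("1") % 2

# X uniform on {0,1}^3, seed length d = 3, adversary flips the lowest seed bit.
err_full = nm_error(ip, range(8), d=3, m=1, A=lambda y: y ^ 1)
print(err_full)
```

A true non-malleable extractor must keep this quantity below the error bound for every source of min-entropy at least k and every admissible A simultaneously, which is what makes explicit constructions hard.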
As we can see from the definitions, a non-malleable extractor is a stronger version of the strong
extractor, in the sense that it requires the output to be close to uniform even conditioned on both
the seed Y and the output nmExt(X, A(Y)) on a different but arbitrarily correlated seed A(Y).

The motivation for studying non-malleable extractors is the privacy amplification problem, a fundamental
problem in symmetric cryptography that has been studied by many researchers. Bennett,
Brassard, and Robert introduced this problem in [BBR88]. In the basic setting, two parties
(Alice and Bob) share an n-bit secret key X, which is weakly random. This could happen because
the secret comes from a password or biometric data, which are themselves weakly random, or be- 
cause an adversary Eve managed to learn some partial information about an originally uniform 
secret, for example via side channel attacks. We measure the entropy of X by the min-entropy 
defined above. The goal is to have Alice and Bob communicate over a public channel so that they 
can convert X into a nearly uniform secret key. Generally, we also assume that Alice and Bob have 
local private uniform random bits. The problem is the presence of the adversary Eve, who can see 
every message transmitted in the channel and may or may not change the messages. We assume 
that Eve has unlimited computational power. 

The case where Eve is passive, i.e., cannot change the messages, can be solved simply by using 
the above mentioned strong seeded extractors. The case where Eve is active (i.e., can change the 
messages in arbitrary ways), on the other hand, is much more difficult. Historically, Maurer and 
Wolf [MW97] gave the first non-trivial protocol in this case. Their protocol takes one round and 
works when the entropy rate of the weakly-random secret X is bigger than 2/3. Dodis, Katz, 
Reyzin, and Smith [DKRS06] later improved this result to give protocols that work for entropy 
rate bigger than 1/2. One drawback in both cases is that the final secret key R is much shorter 
than the min-entropy of X. Later, Dodis and Wichs [DW09] showed that no one-round protocol 
exists for entropy rate less than 1/2. The first protocol that breaks the 1/2 entropy rate barrier is
due to Renner and Wolf [RW03], who gave a protocol that works for essentially any entropy
rate. However, their protocol takes O(s) rounds and only achieves entropy loss $O(s^2)$, where s is
the security parameter of the protocol. Kanukurthi and Reyzin [KR09] simplified their protocol,
but the parameters remain essentially the same.

In [DW09], Dodis and Wichs showed that explicit non-malleable extractors can be used to give
privacy amplification protocols that take an optimal 2 rounds and achieve optimal entropy loss
O(s). They showed that non-malleable extractors exist when $k > 2m + 3\log(1/\varepsilon) + \log d + 9$ and
$d > \log(n - k + 1) + 2\log(1/\varepsilon) + 7$. However, they only constructed weaker forms of non-malleable
extractors and they gave a protocol that takes 2 rounds but that still has entropy loss $O(s^2)$.
Chandran, Kanukurthi, Ostrovsky and Reyzin [CKOR10] improved the entropy loss to O(s) but
the number of rounds becomes O(s) as well.

Dodis, Li, Wooley and Zuckerman [DLWZ11] constructed the first explicit non-malleable extractor.
Their construction works for entropy k > n/2, but they use a large seed length d = n and
the efficiency when outputting more than log n bits relies on an unproven assumption. Cohen, Raz,
and Segev [CRS12] later gave an alternative construction that also works for k > n/2, but uses
a short seed length and does not rely on any unproven assumption. The construction in [CRS12]
also allows multiple adversarial functions $\{A_i\}$. By using the non-malleable extractors, these two
papers thus gave 2-round privacy amplification protocols that achieve optimal entropy loss O(s).
However, since both constructions of non-malleable extractors are only shown to work for entropy
k > n/2,^2 the protocols also only work for k > n/2. For any constant $\delta > 0$, [DLWZ11] also

^2 We remark that the 1-bit case construction in [DLWZ11] is a special case of the construction in [CRS12]. Also,
gave a protocol for $k = \delta n$ that runs in $\mathrm{poly}(1/\delta)$ rounds and achieves optimal entropy loss O(s).
Recently, Li [Li12] introduced the notion of a non-malleable condenser, which is a relaxation of a
non-malleable extractor. He showed that non-malleable condensers for (n, k) sources also give privacy
amplification protocols that take an optimal 2 rounds and achieve optimal entropy loss O(s).
However, the non-malleable condensers constructed in [Li12] also only work for k > n/2. Thus
the natural open question is whether we can construct non-malleable extractors or condensers for
smaller min-entropy, and whether there are 2-round privacy amplification protocols with optimal
entropy loss for smaller min-entropy.

One interesting aspect of the two known constructions of non-malleable extractors is that they
are also both two-source extractors. Indeed, the construction in [DLWZ11] is in fact one of the two-source
extractors introduced in [CG88], which requires the sources to have min-entropy > n/2, and
the construction in [CRS12] is in fact the two-source extractor in [Raz05], which requires at least
one of the sources to have min-entropy > n/2. Coincidentally, when used as non-malleable extractors,
both of these constructions also require the weak source to have min-entropy > n/2. These facts
suggest that, although these two kinds of extractors seem quite different, there may be
some connections between them. However, before this work, no such connection was known.

1.1 Our results 

In this paper, we present a strong connection between non-malleable extractors and two-source 
extractors. First, we show that non-malleable extractors can be used to construct two-source 
extractors. If the non-malleable extractor works for small min-entropy and has a short seed length
(with respect to $\log(1/\varepsilon)$, where $\varepsilon$ is the error of the extractor), then the resulting two-source extractor beats
the best known construction of two-source extractors.

Theorem 1.4. Assume that for any $\varepsilon > 0$, we have explicit constructions of $(k, \varepsilon)$-non-malleable
extractors with seed length $d = 2\log(1/\varepsilon) + o(n)$ and output length m. Then there exists a constant
$\delta > 0$ and an explicit construction of two-source extractors that take as input an $(n, (1/2 - \delta)n)$
source and an independent (n, k) source, and output m bits with error $2^{-\Omega(n)}$.

Note that if k is small, say k = n/3 then this already beats the best known two-source extractors, 
but better results can be achieved if we have explicit constructions of generalized non-malleable 
extractors. We have the following definition (which already appears in [CRS12]). 

Definition 1.5. A function $\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(r, k, \varepsilon)$-non-malleable extractor
if, for any source X with $H_\infty(X) \geq k$ and any r functions $A_i : \{0,1\}^d \to \{0,1\}^d$, $i = 1, \ldots, r$, such
that $A_i(y) \neq y$ for all i and y, the following holds. When Y is chosen uniformly from $\{0,1\}^d$ and
independent of X,

$$(\mathrm{nmExt}(X, Y), \{\mathrm{nmExt}(X, A_i(Y))\}, Y) \approx_\varepsilon (U_m, \{\mathrm{nmExt}(X, A_i(Y))\}, Y).$$

Here r is the number of adversarial seeds. Note that traditional non-malleable extractors are
just $(1, k, \varepsilon)$-non-malleable extractors according to our definition. In Appendix A we show that
for any constant r, $(r, k, \varepsilon)$-non-malleable extractors exist with seed length d > |log(n - k) +
3 log(1/$\varepsilon$) + O(1). Now we have the following theorem.

it is possible that the construction in [DLWZ11] can work for entropy k < n/2 (but so far nobody has been able to prove it),
but the construction in [CRS12] in general cannot work for entropy k < n/2.
Theorem 1.6. For any constant b > 2 and any constant $0 < \delta < 1$, there exists a constant
$C = C(\delta) = \mathrm{poly}(1/\delta)$ such that the following holds. Assume that for any $\varepsilon > 0$ there exists an
explicit construction of $(C, k, \varepsilon)$-non-malleable extractors with seed length $d = b\log(1/\varepsilon) + o(n)$ and
output length m. Then there exists an explicit construction of two-source extractors that take as
input an $(n, \delta n)$ source and an independent (n, k) source, and output m bits with error $2^{-\Omega(n)}$.

Note that if we have a $(C, k, \varepsilon)$-non-malleable extractor for $k = \delta n$ and some constant
$C = C(\delta) = \mathrm{poly}(1/\delta)$, then this will give us a two-source extractor for $(n, \delta n)$ sources. If $\delta$ is small
this will be a big breakthrough for two-source extractors. This also implies that, given current
techniques, the $(r, k, \varepsilon)$-non-malleable extractor in [CRS12] is probably the best that we can achieve.

Next, we show that in the opposite direction, certain two-source extractors can be used to
construct non-malleable extractors. The two-source extractors we will use are those that are constructed
based on the inner product function. More specifically, we will consider two-source extractors
of the form TExt = IP(f(X), Y), where IP is the inner product function over $\mathbb{F}_2$ and f(X)
stands for some function (encoding) of the source X. We have the following theorem.

Theorem 1.7. Given two integers r, i such that i > r, assume that we have a two-source extractor
TExt = IP(f(X), W) such that when given an (n, k)-source X and an independent $(n_2, n_2/(r+1) - i)$-source
W, TExt outputs 1 bit with error $\varepsilon$. Then there exists an explicit construction of $(r, k, \varepsilon')$-non-malleable
extractors that output 1 bit with error $\varepsilon' = O(r2^{-i} + 2^{r}\varepsilon)$.

Using this theorem, and by combining known two-source extractors, we obtain new and improved
constructions of non-malleable extractors. We give the first explicit constructions of non-malleable
extractors that work for min-entropy k < n/2. One of them is unconditional and works
for $k = (1/2 - \delta)n$ for some universal constant $\delta > 0$. The other is conditional but can potentially
work for $k = \delta n$ for any constant $\delta > 0$. Specifically, we have the following theorems.

Theorem 1.8. There exists a constant $0 < \delta < 1$ and an explicit $(k, \varepsilon)$-non-malleable extractor
$\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^m$ with $k = (1/2 - \delta)n$, $m = \Omega(n)$ and $\varepsilon = 2^{-\Omega(n)}$.

Our conditional result needs to use an affine extractor and an assumption from additive combinatorics,
as used in [BSZ11]. Thus we first define affine extractors and state the assumption.

Definition 1.9. An $[n, m, \rho, \varepsilon]$ affine extractor is a deterministic function $f : \{0,1\}^n \to \{0,1\}^m$
such that whenever X is the uniform distribution over some affine subspace over $\mathbb{F}_2^n$ with dimension
$\rho n$, we have that for every $z \in \{0,1\}^m$,

$$|\Pr[f(X) = z] - 2^{-m}| < \varepsilon.$$

Note that we bound the error by the $\ell_\infty$ norm instead of the traditional $\ell_1$ norm, as in [BSZ11].
We will let $\lambda$ denote the entropy loss rate, i.e., $\lambda = 1 - m/(\rho n)$. We note that it is straightforward to
show by the probabilistic method that such extractors exist for any constant $\rho, \lambda > 0$. However, the
state-of-the-art constructions only achieve $\lambda$ bigger than 1/2.

[BSZ11] also introduced the Approximate Duality conjecture (ADC), which basically says that
if two independent sources X, Y with linear entropy are such that IP(X, Y) is not close to uniform,
then there exist two subsources $X' \subseteq X$, $Y' \subseteq Y$ with small deficiency such that IP(X', Y') is
constant. In [BSZ11] it is shown that ADC is implied by the well-known Polynomial Freiman-Ruzsa
Conjecture in additive combinatorics. For a formal definition, see Section 6.3. We now have
the following theorems.
Theorem 1.10. Assume the ADC conjecture and we have an explicit [n, m, |, 2 affine extractor
with m = (1 — A)|n. Then there exists a semi-explicit {k,e) -non-malleable extractor nmExt : 
{0, 1}" X {0, l}*^ ^ {0, l}"" with k = j^n, d = j^n, m = fi(n) and e = 2"^^^. 

Theorem 1.11. Given a constant integer r, assume the ADC conjecture and we have an explicit 
[n,m, ^q^,2~™] affine extractor with m = (1 — \)^^^n. Then there exists a semi-explicit (r,k,e)- 

non-malleable extractor with k = j^^jji^^n, seed length d = ^qrj-qqf^rjjT]^?^ — 1 and e = 2"^*^"^. 

Remark 1.12. Here we use the term "semi-explicit" to mean that the construction may run in time 2"'.
It is semi-explicit in the sense that the running time is polynomial in the length of the extractor's
truth table (note that an exhaustive search takes time 2^"). If we have affine extractors with large
output size such that $\lambda \to 0$, then we can essentially achieve $k = \alpha n$ for any constant $\alpha > 0$.

Finally, we give a new privacy amplification protocol for min-entropy $k = \delta n$ for any constant
$\delta > 0$. Although we don't have explicit non-malleable extractors or condensers for such small k,
our protocol simultaneously achieves optimal round complexity (2 rounds), asymptotically optimal 
entropy loss and asymptotically optimal communication complexity. This is the first optimal privacy 
amplification protocol for arbitrarily linear min-entropy. We have the following theorem. 

Theorem 1.13. For any constant $0 < \delta < 1$ there exists a constant $0 < \beta < 1$ such that as long as
$s < \beta n$, there is an efficient 2-round privacy amplification protocol for any $(n, \delta n)$ weak secret X
with security parameter s, entropy loss $O(s + \log n)$ and communication complexity $O(s + \log n)$.

Thus, in the case where $k = \delta n$, our result dramatically improves all previous results. In particular,
it improves the round complexity of the protocol in [DLWZ11] from $\mathrm{poly}(1/\delta)$ to 2, and thus answers
an open problem in [DLWZ11].

2 Overview of The Constructions and Techniques 

In this section we give an overview of our constructions and the techniques used. In order to give 
a clean description, we shall be informal and imprecise sometimes. 

2.1 From non-malleable extractors to two-source extractors

Given a $(k, \varepsilon)$-non-malleable extractor nmExt with seed length $d = 2\log(1/\varepsilon) + o(n)$, here is how
we can get a two-source extractor. Assume that we have an (n, k) source X and an independent
$(n, (1/2 - \delta)n)$ source Y for some constant $\delta > 0$. Our first step is to use the 1-bit condenser in
[Zuc07] to convert Y into two sources $Y_1, Y_2$ such that each of them has $\ell = \Omega(n)$ bits and one of
them has min-entropy at least $(1/2 + \delta)\ell$. Note that for an appropriately chosen $\delta$ this is indeed
possible. Without loss of generality assume that $Y_1$ has min-entropy at least $(1/2 + \delta)\ell$.

Our key observation here is that $Y_2$ can now be viewed as a function of $Y_1$. More precisely,
we show that the source Y is a convex combination of sources $\{Y^j\}$ such that for each $Y^j$, the
corresponding $Y_1^j$ also has min-entropy at least $(1/2 + \delta)\ell$, and $Y_2^j$ is a deterministic function of $Y_1^j$.
Now this looks like the setting of a non-malleable extractor, where we have one seed and another
correlated seed. However, there is a small problem: $Y_1^j$ and $Y_2^j$ may be equal sometimes. To solve
this, we let $\bar{Y}_1 = Y_1 \circ 0$ and $\bar{Y}_2 = Y_2 \circ 1$. In this way we guarantee that $\bar{Y}_1$ and $\bar{Y}_2$ are different, and
$\bar{Y}_2$ is still a function of $\bar{Y}_1$. Finally, this only increases the length of the seed by 1.
Now we are all set, and we can take the two-source extractor to be
$\mathrm{TExt}(X, Y) = \mathrm{nmExt}(X, \bar{Y}_1) \oplus \mathrm{nmExt}(X, \bar{Y}_2)$, where $\bar{Y}_1 = Y_1 \circ 0$ and $\bar{Y}_2 = Y_2 \circ 1$.
Note that the seed $\bar{Y}_1$ here is not uniform. However, a simple argument shows that
a non-malleable extractor with seed length d and error $\varepsilon$ remains a non-malleable extractor even
if the seed only has min-entropy k', with error increased to $2^{d-k'}\varepsilon$. In our case, with seed length
$d = \ell + 1 = \Omega(n) = 2\log(1/\varepsilon) + o(n)$ and $k' = (1/2 + \delta)\ell$, the error is
$\varepsilon' = 2^{d-k'}\varepsilon \approx 2^{(1/2-\delta)\ell - \ell/2} = 2^{-\delta\ell} = 2^{-\Omega(n)}$.
By the non-malleability of nmExt, we get that TExt(X, Y) is $2^{-\Omega(n)}$-close to uniform.
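
The shape of this construction can be sketched as follows. This is a structural sketch only: `nm_ext` and `condense` are parameters, and the toy stand-ins `toy_condense` and `toy_nm_ext` below are hypothetical placeholders that are neither the condenser of [Zuc07] nor a real non-malleable extractor. Sources and seeds are represented as bit strings.

```python
def two_source_from_nm(nm_ext, condense):
    """Build TExt(x, y) = nm_ext(x, y1 o 0) XOR nm_ext(x, y2 o 1)."""
    def t_ext(x, y):
        y1, y2 = condense(y)   # two blocks of the same length
        seed1 = y1 + "0"       # the appended bits guarantee seed1 != seed2
        seed2 = y2 + "1"
        return nm_ext(x, seed1) ^ nm_ext(x, seed2)
    return t_ext

# Toy stand-ins, for illustration only (NOT real condensers or extractors).
def toy_condense(y):
    half = len(y) // 2
    return y[:half], y[half:]

def toy_nm_ext(x, seed):
    # Inner product over F_2 of two equal-length bit strings.
    return sum(int(a) & int(b) for a, b in zip(x, seed)) % 2

t_ext = two_source_from_nm(toy_nm_ext, toy_condense)
bit = t_ext("100", "1101")   # seeds are "110" and "011"
print(bit)
```

Appending different bits costs one extra seed bit but ensures the second seed is a valid "tampered" seed in the sense of the non-malleable extractor definition.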

We note that given any strong extractor with error $\varepsilon$ and seed length d, it remains an extractor
even if the seed only has min-entropy k', with error increased to $2^{d-k'}\varepsilon$. However, since the seed
length is at least $d = \log(n - k) + 2\log(1/\varepsilon) - O(1)$, this will never be able to get k' below d/2. On
the other hand, if the extractor is non-malleable as in our case, then it does allow us to break the
entropy rate 1/2 barrier, and get a two-source extractor for an $(n, (1/2 - \delta)n)$ source and an (n, k)
source. This shows that non-malleability is a highly non-trivial property of a seeded extractor.

Similarly, if we have $(r, k, \varepsilon)$-non-malleable extractors for larger r, then we can afford to have
more correlated seeds $Y_i$, or equivalently, more sources in the output of the condenser. Thus we
can deal with smaller entropy in Y. For example, for any constant $\delta > 0$, condensers based on sum-product
theorems [BKS+05, Raz05, Zuc07] allow us to convert an $(n, \delta n)$ source into a constant D
number of sources such that each of them has $\ell = \Omega(n)$ bits and one of them has min-entropy at
least $0.9\ell$. If we have $(D - 1, k, \varepsilon)$-non-malleable extractors with suitable parameters, then we can
get two-source extractors for an $(n, \delta n)$ source and an (n, k) source.

2.2 From two-source extractors to non-malleable extractors

As stated before, we focus on two-source extractors of the form IP(f(X), Y), where IP is the inner
product function. First consider the simplest function IP(X, Y). Note that it is a good two-source
extractor. For two independent sources on n bits, it works as long as the sum of the entropies of the
two sources is greater than n. However, at first this function does not seem to be a good candidate
for a non-malleable extractor. To see this, consider the inner product function over $\mathbb{F}_2$. Let X be a
source that is obtained by concatenating the bit 0 with $U_{n-1}$, and let Y be an independent uniform
seed over $\{0,1\}^n$. Now for any $y \in \{0,1\}^n$, let A(y) be y with the first bit flipped. Thus we see
that for all x in the support of X, one has $\langle x, y \rangle = \langle x, A(y) \rangle$. Therefore, the inner product function
is not a non-malleable extractor even for weak sources with min-entropy k = n - 1.
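
The counterexample can be checked exhaustively for a small n. The sketch below is a toy verification with n = 4, representing bit vectors as integers whose most significant bit plays the role of the "first bit"; the names `ip`, `support` and `A` are chosen for this illustration.

```python
def ip(x, y):
    """Inner product over F_2 of two n-bit vectors stored as integers."""
    return bin(x & y).count("1") % 2

n = 4
first_bit = 1 << (n - 1)

# Source X: the bit 0 followed by U_{n-1}, so min-entropy n - 1.
support = [x for x in range(2**n) if (x & first_bit) == 0]

def A(y):
    """Adversary: flip the first bit of the seed, so A(y) != y for all y."""
    return y ^ first_bit

# For every x in the support and every seed y, the two outputs coincide.
all_equal = all(ip(x, y) == ip(x, A(y)) for x in support for y in range(2**n))
print(all_equal)
```

Since the tampered output always equals the real output, an observer who sees nmExt(X, A(Y)) learns nmExt(X, Y) exactly, which is the opposite of non-malleability.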

In the above example, we have that for all x in the support of X, IP(x, y) = IP(x, A(y)). Or
equivalently, IP(x, y + A(y)) = 0. How does this happen? Looking closely at this example, our
key observation is that this is because the range of Y is too large. Indeed, in this example the
range of Y is the entire $\{0,1\}^n$, thus for any y the adversary can choose a different A(y) such that
$y + A(y) = 10\cdots0$ so that $\forall x \in \mathrm{Supp}(X)$, IP(x, y + A(y)) = 0.

This observation suggests that we should choose the range of Y to be a subset $S \subset \{0,1\}^n$, so
that for some y's, the adversary will be unable to choose the appropriate A(y) from S. Equivalently,
we take a shorter seed length $\ell$, choose a uniform $y \in \{0,1\}^\ell$ and map y to an element in $\{0,1\}^n$.
This is essentially an encoding. Now let us see what properties we need the encoding to have.

We start with a construction for min-entropy k > n/2. Assume that we have an (n, k) source
X with $k = (1/2 + \delta)n$ for some constant $\delta > 0$. We take an independent and uniform $y \in \{0,1\}^\ell$
and encode y to $\bar{y} \in \{0,1\}^n$. For any function A, let $\bar{y}'$ be the encoding of A(y). We will use an
injective encoding, so that $\forall y$, $\bar{y}' \neq \bar{y}$. The output of the non-malleable extractor is then $\mathrm{IP}(X, \bar{Y})$.

To show that $\mathrm{IP}(X, \bar{Y})$ is a non-malleable extractor, where $\bar{Y}$ and $\bar{Y}'$ denote the encodings of Y and A(Y),
it suffices to show that $\mathrm{IP}(X, \bar{Y})$ is close
to uniform, and that $\mathrm{IP}(X, \bar{Y}) \oplus \mathrm{IP}(X, \bar{Y}')$ is close to uniform. The first part is easy. If X has
min-entropy k > n/2, then we can take Y to be the uniform distribution over some $\ell$ > n/2 bits.
Since the encoding is injective, $\bar{Y}$ will have min-entropy $\ell$ > n/2. Thus $\mathrm{IP}(X, \bar{Y})$ is close to uniform.
For the second part, note that $\mathrm{IP}(X, \bar{Y}) \oplus \mathrm{IP}(X, \bar{Y}') = \mathrm{IP}(X, \bar{Y} + \bar{Y}')$. Thus now we need $\bar{Y} + \bar{Y}'$
to have large min-entropy. Indeed, in the above counterexample where $\ell = n$, the adversary can
choose A such that $\bar{y} + \bar{y}'$ is always equal to $10\cdots0$ and thus has entropy 0. Now when we take
$\ell < n$ and map $\{0,1\}^\ell$ to $S \subset \{0,1\}^n$, we want $\bar{Y} + \bar{Y}'$ to have a large support size.

The ideal case would be that $\bar{Y} + \bar{Y}'$ also has support size $|S| = 2^\ell$. This can be achieved if the
encoding has the following property: for every two different $y_1, y_2$ we have that $\bar{y}_1 + \bar{y}'_1 \neq \bar{y}_2 + \bar{y}'_2$,
equivalently, $\bar{y}_1 + \bar{y}'_1 + \bar{y}_2 + \bar{y}'_2 \neq 0$. Indeed, if this is true then $\bar{Y} + \bar{Y}'$ also has min-entropy $\ell$ > n/2,
and thus $\mathrm{IP}(X, \bar{Y}) \oplus \mathrm{IP}(X, \bar{Y}')$ is close to uniform. Looking carefully at this property, we see that
it can be ensured (at least almost ensured, as we will explain shortly) if we have another property:
the elements in S (when viewed as vectors in $\mathbb{F}_2^n$) are 4-wise linearly independent. Indeed, assume
that the elements in S are 4-wise linearly independent. Then if $\bar{y}_1 + \bar{y}'_1 + \bar{y}_2 + \bar{y}'_2 = 0$, the only
possible situation is that $\bar{y}'_1 = \bar{y}_2$ and $\bar{y}'_2 = \bar{y}_1$. Thus there cannot be three different $y_1, y_2, y_3$ such
that $\bar{y}_1 + \bar{y}'_1 = \bar{y}_2 + \bar{y}'_2 = \bar{y}_3 + \bar{y}'_3$. Thus the min-entropy of $\bar{Y} + \bar{Y}'$ is at least $\ell - 1$.

So now the question is to explicitly find a large subset $S \subset \{0,1\}^n$ such that the elements in
S are 4-wise linearly independent. Note that in particular this implies that the sums of any two
different pairs of elements in S cannot be the same. Thus we have $\binom{|S|}{2} < 2^n$. Therefore |S| can be
at most roughly $2^{n/2}$. On the other hand, in order to work for any min-entropy k > n/2, we will
need $\ell \geq n/2$ and thus $|S| = 2^\ell \geq 2^{n/2}$. These are very tight upper and lower bounds. Luckily, we
have explicit constructions that meet these bounds. We will think of the elements in S as columns
in a parity check matrix of some binary linear code. Thus we basically need a code with block length
$2^{n/2}$ and message length $2^{n/2} - n$. The 4-wise linearly independent property basically is equivalent
to saying that the code has distance at least 5. This is precisely the $[2^{n/2}, 2^{n/2} - n, 5]$-BCH code.
Note that although the parity check matrix has $2^{n/2}$ columns, each column is $(a, a^3)$ for a different
element $a \in \mathbb{F}^*_{2^{n/2}}$. Thus the encoding from y to $\bar{y}$ can be computed efficiently.
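
The 4-wise independence of the columns $(a, a^3)$ can be verified by brute force over a small field. The sketch below uses GF(2^4), so n = 8, and assumes the irreducible polynomial x^4 + x + 1 as the field representation; that choice, and the names `gf_mul`, `encode` and `xor_all`, are assumptions of this example rather than anything fixed by the text.

```python
from itertools import combinations

ELL = 4          # work in GF(2^ELL); encoded vectors have n = 2 * ELL bits
POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2) (assumed representation)

def gf_mul(a, b):
    """Multiply two elements of GF(2^ELL), stored as ELL-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> ELL:   # degree reached ELL: reduce modulo POLY
            a ^= POLY
    return r

def encode(a):
    """Map a nonzero field element a to the column (a, a^3), packed into 2*ELL bits."""
    a3 = gf_mul(gf_mul(a, a), a)
    return (a << ELL) | a3

def xor_all(vectors):
    acc = 0
    for v in vectors:
        acc ^= v
    return acc

columns = [encode(a) for a in range(1, 2**ELL)]

# 4-wise linear independence over F_2: no non-empty set of at most
# 4 distinct columns XORs to the zero vector.
four_wise_independent = all(
    xor_all(subset) != 0
    for size in range(1, 5)
    for subset in combinations(columns, size)
)
print(four_wise_independent)
```

Computing a column needs only two field multiplications, which is why the encoding is efficient even though the parity check matrix itself has exponentially many columns.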

Once we have the encoding, we can choose $\ell = n/2$ and we know that the encoded seed $\bar{Y}$ has min-entropy $\ell$
and $\bar{Y} + \bar{Y}'$ has min-entropy $\ell - 1$. Now it is straightforward to show that both $\mathrm{IP}(X, \bar{Y})$ and
$\mathrm{IP}(X, \bar{Y} + \bar{Y}')$ are close to uniform. Thus we obtain a non-malleable extractor for entropy k > n/2.
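
The claim that the sum of the encoded seed and the encoded tampered seed loses at most one bit of min-entropy can be illustrated empirically: for the $(a, a^3)$ encoding, any adversarial map with $A(y) \neq y$ makes each value of the sum appear for at most two seeds. The sketch below samples random adversaries over GF(2^4); the field polynomial x^4 + x + 1 and all helper names are assumptions of this example, and random sampling is only an illustration, not a proof.

```python
import random
from collections import Counter

ELL = 4          # GF(2^ELL), assumed representation as in a toy BCH example
POLY = 0b10011   # x^4 + x + 1

def gf_mul(a, b):
    """Multiply two elements of GF(2^ELL), stored as ELL-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> ELL:
            a ^= POLY
    return r

def encode(a):
    """Encode a nonzero field element a as the vector (a, a^3)."""
    return (a << ELL) | gf_mul(gf_mul(a, a), a)

rng = random.Random(0)             # fixed seed so the run is reproducible
seeds = list(range(1, 2**ELL))     # nonzero field elements used as seeds

worst = 0
for _ in range(200):
    # A random adversarial map with A[y] != y for every seed y.
    A = {y: rng.choice([z for z in seeds if z != y]) for y in seeds}
    sums = Counter(encode(y) ^ encode(A[y]) for y in seeds)
    worst = max(worst, max(sums.values()))

print(worst)   # largest number of seeds sharing one value of the sum
```

A multiplicity of at most two per value is exactly what gives min-entropy at least $\ell - 1$ for the sum when the seed is uniform.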

Thinking about the above encoding for a moment, one realizes that the same encoding can be
used in any two-source extractor of the form IP(f(X), Y). Specifically, assume that IP(f(X), Y)
is a two-source extractor for an (n, k) source X and an independent (n, n/2 - 1) source Y. Then
by the same argument above, if we choose the seed Y to be the uniform distribution over $\{0,1\}^{n/2}$
and encode Y to $\bar{Y}$ like before, we will have that both $\bar{Y}$ and $\bar{Y} + \bar{Y}'$ have min-entropy at least
n/2 - 1. Thus both $\mathrm{IP}(f(X), \bar{Y})$ and $\mathrm{IP}(f(X), \bar{Y}) \oplus \mathrm{IP}(f(X), \bar{Y}')$ are close to uniform. Therefore
we get a non-malleable extractor for min-entropy k.

Similarly, if we have a two-source extractor IP(f(X), Y) for an (n, k) source X and an independent
(n, k') source Y with $k' \approx n/(r+1)$, then we can use a BCH code with distance 2r + 3 to
construct a $(r, k, \varepsilon)$-non-malleable extractor. We choose the seed Y to be the uniform distribution
over $\{0,1\}^{n/(r+1)}$ and encode Y to $\bar{Y}$ using the parity check matrix, i.e., $\bar{Y} = (Y, Y^3, \ldots, Y^{2r+1})$
when Y is viewed as an element in $\mathbb{F}^*_{2^{n/(r+1)}}$. Since the columns of the parity check matrix are
2(r + 1)-wise linearly independent, we can show that for any subset $S \subseteq [r]$, $\bar{Y} \oplus \bigoplus_{j \in S} \overline{A_j(Y)}$
has min-entropy roughly n/(r + 1). Thus $\mathrm{IP}(f(X), \bar{Y}) \oplus \bigoplus_{j \in S} \mathrm{IP}(f(X), \overline{A_j(Y)})$ is close to uniform.
Therefore we get a $(r, k, \varepsilon)$-non-malleable extractor.
2.3 Non-malleable extractors for min-entropy k < n/2

We give the first construction of non-malleable extractors for min-entropy k < n/2 by observing
that the encoding of sources in [Bou05] gives a function f such that IP(f(X), Y) is a two-source
extractor for an $(n, (1/2 - \delta)n)$ source X and an independent (n, k') source Y with $k' \approx n/2$.

Specifically, let X be a distribution over some vector space and let cX be the distribution
obtained by sampling $x_1, x_2, \ldots, x_c$ from c independent copies of X and computing $\sum_{i=1}^{c} x_i$. By
Fourier analysis and the Cauchy-Schwarz inequality one can show that in order to prove IP(X, Y)
is close to uniform, it suffices to prove that IP(cX, Y) is close to uniform with a smaller error, for
some integer c > 1. In [Bou05], Bourgain showed that for a weak source X with min-entropy rate
$1/2 - \delta$ for some constant $\delta > 0$, one can encode X to Enc(X) such that 3Enc(X) is close to having
min-entropy rate $1/2 + \delta$. Thus IP(Enc(X), Y) is a two-source extractor that meets our needs.
Therefore we obtain our non-malleable extractors for min-entropy $k = (1/2 - \delta)n$.

2.4 Non-malleable extractors for any constant min-entropy rate 

In [BSZ11], Ben-Sasson and Zewi showed that affine extractors with large output size can be used
to construct two-source extractors for min-entropy rate < 1/2. Their "preimage construction" can
potentially achieve any constant min-entropy rate. We observe that their encoding gives a function
f such that IP(f(X), Y) is a two-source extractor for two independent sources with min-entropy
rate $\delta$ for any constant $\delta > 0$. Specifically, they showed that if we have an affine extractor with
large output size, then there is an injective mapping $F : \{0,1\}^n \to \{0,1\}^{n'}$ that maps $\{0,1\}^n$ into
the preimage of a certain output of the affine extractor, such that for any weak source X with min-entropy
$\delta n$, F(Supp(X)) is not contained in any affine subspace of dimension, say, $(1 - \delta/2)n'$. Thus
when Y is an $(n', \delta n')$ source, we have that IP(F(X), Y) is non-constant. Next, similarly to [BSZ11],
the ADC conjecture implies that in fact IP(F(X), Y) is close to uniform. Thus IP(F(X), Y) is a
two-source extractor that meets our needs. Therefore we obtain a non-malleable extractor (and
even a $(r, k, \varepsilon)$-non-malleable extractor) for min-entropy $k = \delta n$.

2.5 Increasing output size 

We can also increase the output size to $\Omega(n)$ for all our constructions with 1 bit output. To do
this, note that we encode the seed $Y$ by using the columns of a parity check matrix of a BCH
code. Equivalently, the encoding is $\bar{Y} = (Y, Y^3)$ when we use a field $\mathbb{F}_{2^l}$ with $l = \Theta(n)$
and $Y$ is viewed as an element in $\mathbb{F}_{2^l}^*$. Now treat $\mathbb{F}_{2^l}$ as the vector space $\mathbb{F}_2^l$ and take $l$ elements
$b_1, \ldots, b_l \in \mathbb{F}_{2^l}$ that correspond to a basis of $\mathbb{F}_2^l$. For each $b_i$ we define one bit $Z_i = \mathsf{IP}(f(X), b_i\bar{Y})$.

We then show that $\{Z_i\}$ satisfy the conditions of a non-uniform XOR lemma, Lemma 3.13.
Specifically, let $Z_i' = \mathsf{IP}(f(X), b_i\bar{Y}')$ where $Y' = A(Y)$. For any non-empty subset $S_1 \subseteq [l]$ and any
subset $S_2 \subseteq [l]$, by the linearity of the inner product function, the XOR of the $Z_i$'s where $i \in S_1$ and the
$Z_j'$'s where $j \in S_2$ is of the form $\mathsf{IP}(f(X), t_1\bar{Y} + t_2\bar{Y}')$, with $t_1, t_2 \in \mathbb{F}_{2^l}$. Since $S_1$ is non-empty we
have $t_1 \neq 0$. We then show that $t_1\bar{Y} + t_2\bar{Y}'$ has roughly the same min-entropy as $Y$ (at least the
min-entropy of $Y$ minus $\log 3$). Thus $\mathsf{IP}(f(X), t_1\bar{Y} + t_2\bar{Y}')$ is close to uniform. We further show
that the error is $2^{-\Omega(n)}$. Thus by Lemma 3.13 we can output $m = \Omega(n)$ bits with error $2^{-\Omega(n)}$.

2.6 Reducing seed length

In all the constructions where we encode the seed $Y$ by a parity check matrix, the seed length
is linear in the source length. However the error is also $2^{-\Omega(n)}$. If we only need to achieve a
bigger error, we can reduce the seed length by using the parity check matrix of a BCH code
with larger distance. Specifically, when the distance is $2t + 1$ the seed length is roughly $n/t$.
However we need to guarantee something else. For example, in the construction for min-entropy
$k > n/2$, we need to show that both $\mathsf{IP}(X, \bar{Y})$ and $\mathsf{IP}(X, \bar{Y} + \bar{Y}')$ are still close to uniform. This
can be shown as follows. Since now the columns of the parity check matrix are $2t$-wise linearly
independent, both $\frac{t}{2}\bar{Y}$ and $\frac{t}{2}(\bar{Y} + \bar{Y}')$ will now have min-entropy roughly $\frac{t}{2}H_\infty(Y) = n/2$. Thus
we can conclude that both $\mathsf{IP}(X, \frac{t}{2}\bar{Y})$ and $\mathsf{IP}(X, \frac{t}{2}(\bar{Y} + \bar{Y}'))$ are close to uniform, and therefore
both $\mathsf{IP}(X, \bar{Y})$ and $\mathsf{IP}(X, \bar{Y} + \bar{Y}')$ are also close to uniform, by the Cauchy-Schwarz inequality.
However the error increases as the seed length decreases. Calculations show that we can get seed
length $d = O(\log n + \log(1/\epsilon))$.

2.7 An optimal privacy amplification protocol for $k = \delta n$

In [DLWZ11], the authors give a privacy amplification protocol for $k = \delta n$ with $C = \mathrm{poly}(1/\delta)$
rounds and entropy loss $\mathrm{poly}(1/\delta)s$, where $s$ is the security parameter. Here we want to
"compress" the protocol into 2 rounds while still keeping the entropy loss $O(s)$. As in
[DLWZ11], we first use the condenser in [BKS+05, Raz05, Zuc07] to convert the shared $(n, k)$
source $X$ into a somewhere rate-0.9 source $(X_1, \ldots, X_C)$ with $C = \mathrm{poly}(1/\delta)$ rows. Now the high-level
idea of the protocol is as follows. In the first round, Alice samples a fresh random string $Y_1$
from her private random bits and sends it to Bob, where Bob receives a possibly modified version
$Y_1'$. In the second round, Bob samples a fresh random string $W'$ from his private random bits and
tries to send it to Alice, where Alice receives a possibly modified version $W$. We want a protocol
such that if Eve does not change $Y_1$, then with high probability Bob can authenticate $W'$ to Alice
and they can both output $\mathsf{Ext}(X, W)$ as the final outputs, by using a strong seeded extractor $\mathsf{Ext}$.
If Eve does change $Y_1$, then with high probability Alice should be able to detect this and reject.

The first goal is relatively easy to achieve. At the end of the first round, Alice and Bob compute
$Z = \mathsf{Ext}(X, Y_1)$ and $Z' = \mathsf{Ext}(X, Y_1')$ respectively, using a strong extractor $\mathsf{Ext}$. If Eve does not
change $Y_1$ then $Z = Z'$ and is private and uniform. Thus in the second round Bob can authenticate
$W'$ to Alice by also sending a tag $T'$ produced by a standard MAC (message authentication code)
with $Z'$ as the key. We now focus on the second goal. If the extractor $\mathsf{Ext}$ used in computing $Z$ and $Z'$
is non-malleable for entropy $k$ then this can be done by using the protocol proposed by Dodis and
Wichs [DW09]. However, we do not have explicit non-malleable extractors for $k = \delta n$.

Nevertheless, we will still have Alice and Bob produce variables $V$ and $V'$ respectively.
We will ensure that, if Eve changes $Y_1$ to a different $Y_1'$, then even given $T'$ and $V'$, with high
probability Eve cannot come up with the correct $V$ for Alice. If this is true then in the second
round we can have Bob also send $V'$ to Alice, where Alice receives a possibly modified version $\tilde{V}$.
Alice then checks both the tag $T$ and whether $\tilde{V} = V$. If either check fails, Alice rejects. This
will give us a privacy amplification protocol.

The first problem with the above strategy is that now $V'$ may give information about $Z'$, thus
now the MAC key may not be uniform. This is easy to solve since there are constructions of MACs
that work as long as the key has entropy rate $> 1/2$. Thus by limiting the size of $V'$ to be at most
half the size of $Z'$, we can ensure that if Eve does not change $Y_1$, Bob can still authenticate $W'$ to
Alice. We now explain how we produce the variables $V, V'$.

We actually have Alice produce $C$ variables $V = (V_1, \ldots, V_C)$. Similarly, Bob produces $V' =
(V_1', \ldots, V_C')$. For this, we first choose a non-malleable extractor and have Alice and Bob each
apply the extractor to the somewhere rate-0.9 source $(X_1, \ldots, X_C)$, using $Y_1$ and $Y_1'$ as the seeds
respectively. Let the outputs be $(\bar{X}_1, \ldots, \bar{X}_C)$ and $(\bar{X}_1', \ldots, \bar{X}_C')$. Note that one of the $X_j$'s, say
$X_g$, is a rate-0.9 source. Thus we can use the non-malleable extractors in [DLWZ11, CRS12, Li12].
Now we fix $Y_1, Y_1'$, and we have that $\bar{X}_g$ is uniform and independent of $\bar{X}_g'$. Thus we can fix $\bar{X}_g'$
and $\bar{X}_g$ is still uniform. Next, we fix $Z'$. Since now $Z'$ is a deterministic function of $X$, as long as
the size of $Z'$ is smaller than the size of $\bar{X}_g$, conditioned on this fixing $\bar{X}_g$ still has a lot of entropy
left. We will now have Alice extract each $V_i$ from $\bar{X}_i$, and correspondingly Bob will extract each
$V_i'$ from $\bar{X}_i'$. Note that we can indeed ensure that the size of $\bar{X}_g$ is bigger than the size of $Z'$, while the size of
$Z'$ is bigger than the size of $(V_1', \ldots, V_C')$, just by limiting the size of each $V_i'$.

Ideally, we would want $V, V'$ to be such that $V_g$ is close to uniform conditioned on $V' = (V_1', \ldots, V_C')$.
However, we cannot achieve exactly this. Instead, what we can achieve is that $V_g$ is close to
uniform conditioned on $(V_1', \ldots, V_g')$. Once we have this, we can limit the size of $(V_{g+1}', \ldots, V_C')$ to be
smaller than the size of $V_g$. Thus $V_g$ still has a lot of entropy even conditioned on $V' = (V_1', \ldots, V_C')$.
This will ensure that with high probability Eve cannot come up with the correct $V_g$. Since we do
not know which one of $\{X_i\}$ is $X_g$, we will choose $(V_1, \ldots, V_C)$ such that the size of $V_C$ is, say, $2s$,
and for any $i$ the size of $V_i$ is twice the size of $V_{i+1}$. In this way, no matter what $g$ is, the size of
$(V_{g+1}', \ldots, V_C')$ is the size of $V_g$ minus $2s$. Thus $V_g$ still has $2s$ entropy left conditioned on $V'$.
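The doubling schedule for the sizes of the $V_i$ can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name `tag_sizes` and the sample values of $C$ and $s$ are assumptions, not taken from the paper.

```python
def tag_sizes(C, s):
    """Sizes |V_1|, ..., |V_C| with |V_C| = 2s and |V_i| = 2 * |V_{i+1}|."""
    return [2 * s * 2 ** (C - i) for i in range(1, C + 1)]


C, s = 6, 40  # hypothetical number of rows and security parameter
sizes = tag_sizes(C, s)

for g in range(1, C + 1):
    later = sum(sizes[g:])  # |V_{g+1}| + ... + |V_C|
    # Whatever g is, the later blocks together are exactly 2s bits shorter than V_g.
    assert later == sizes[g - 1] - 2 * s

total_bits = sum(sizes)  # equals 2s(2^C - 1), i.e. O(2^C * s)
```

The total length $2s(2^C - 1)$ is the source of the $O(2^C s)$ term in the entropy loss and communication complexity quoted later in this section.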

Finally we explain how we can achieve the above property. We achieve this by using the "look-ahead"
extractor in [DW09] based on an alternating extraction protocol. Specifically, in the first
round we also have Alice sample two other random strings $(Y_2, Y_3)$ and send them to Bob, where Bob
receives $(Y_2', Y_3')$. Note that after we fix $(Y_1, Y_1')$, $(Y_2', Y_3')$ is a deterministic function of $(Y_2, Y_3)$. Now
pick a strong extractor $\mathsf{Ext}$ and Alice performs the following alternating extraction protocol: $S_1 =
Y_3$, $R_1 = \mathsf{Ext}(X, S_1)$, $S_2 = \mathsf{Ext}(Y_2, R_1)$, $R_2 = \mathsf{Ext}(X, S_2)$, $\ldots$, $S_C = \mathsf{Ext}(Y_2, R_{C-1})$, $R_C = \mathsf{Ext}(X, S_C)$.
Bob will perform the same protocol using $(Y_2', Y_3')$ and produces $\{S_i', R_i'\}$. As long as the size of
$(S_i, R_i)$ is limited, this protocol has the property that for any $i$, $(R_i, S_i)$ is uniform and independent
of $\{S_j, R_j, S_j', R_j', j < i\}$. We now modify this protocol such that whenever $S_i$ is used to extract
$R_i$, Alice also uses it to extract $V_i = \mathsf{Ext}(\bar{X}_i, S_i)$. Correspondingly, Bob extracts $V_i' = \mathsf{Ext}(\bar{X}_i', S_i')$.
Now one can show that as long as the size of $V_i$ is also limited, we have that for any $i$, $(R_i, S_i)$ is
uniform and independent of $\{S_j, R_j, V_j, S_j', R_j', V_j', j < i\}$. Specifically, we have that $S_g$ is uniform
and independent of $\{S_j, R_j, V_j, S_j', R_j', V_j', j < g\}$. Moreover, we can show that conditioned on the
fixing of $\{S_j, R_j, V_j, S_j', R_j', V_j', j < g\}$, both $S_g$ and $S_g'$ are deterministic functions of $(Y_2, Y_3)$ and
are thus independent of $\bar{X}_g$. Furthermore, $\bar{X}_g$ still has a lot of entropy left. Now since $\mathsf{Ext}$ is a
strong extractor, we have that $V_g = \mathsf{Ext}(\bar{X}_g, S_g)$ is uniform conditioned on $\{V_j, V_j', j < g\}$ and
$(S_g, S_g')$. Note that we have fixed $(\bar{X}_g', Z')$ before, while $V_g' = \mathsf{Ext}(\bar{X}_g', S_g')$ and $T'$ is a function of
$Z'$. Thus $V_g$ is still uniform even conditioned on $(\{V_j', j \le g\}, T')$. Thus we have achieved our goal.
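The data flow of the modified alternating extraction protocol can be sketched in code. This is a minimal sketch under loud assumptions: `toy_ext` is a hypothetical placeholder built from SHA-256 so that the code runs, and it is not a strong seeded extractor in the information-theoretic sense the analysis requires. All outputs also share one length here, whereas the protocol uses geometrically decreasing sizes for the $V_i$.

```python
import hashlib


def toy_ext(source: bytes, seed: bytes, out_len: int) -> bytes:
    """Placeholder for Ext(source, seed); SHA-256 is only a runnable stand-in."""
    return hashlib.sha256(seed + b"|" + source).digest()[:out_len]


def alternating_extraction(x, rows, y2, y3, C, out_len=8):
    """Compute (S_i, R_i, V_i) for i = 1..C as one party would.

    x    : the shared weak source X
    rows : the C per-row strings from which V_i is extracted
    y2,y3: the two extra strings sent in the first round
    """
    S, R, V = [], [], []
    s = y3                                 # S_1 = Y_3
    for i in range(C):
        r = toy_ext(x, s, out_len)         # R_i = Ext(X, S_i)
        v = toy_ext(rows[i], s, out_len)   # V_i = Ext(row_i, S_i), reusing S_i
        S.append(s)
        R.append(r)
        V.append(v)
        s = toy_ext(y2, r, out_len)        # S_{i+1} = Ext(Y_2, R_i)
    return S, R, V
```

Alice would call this with $(Y_2, Y_3)$ and her rows, and Bob with the received $(Y_2', Y_3')$ and his rows. If Eve changes nothing, both calls produce identical outputs.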

One small problem with the above discussion is that in the first round Alice sends $(Y_1, Y_2, Y_3)$
to Bob and Bob receives $(Y_1', Y_2', Y_3')$. Thus $Y_1'$ is a function of $(Y_1, Y_2, Y_3)$. Therefore fixing $(Y_1, Y_1')$
may cause $(Y_2, Y_3)$ to lose entropy. Thus in the alternating extraction protocol $(Y_2, Y_3)$ may not
be uniform. However, by making the size of $Y_1$ a constant factor smaller than the size of $Y_3$, we
can ensure that $Y_3$ has entropy rate $> 2/3$ conditioned on $(Y_1, Y_1')$. Thus we can add a step in
the alternating extraction protocol: $S_0 = Y_3$, $R_0 = \mathsf{Raz}(S_0, X)$, $S_1 = \mathsf{Ext}(Y_2, R_0)$, and the rest of the
protocol remains the same. Here $\mathsf{Raz}$ is the two-source extractor in [Raz05] that works as long
as one of the sources has entropy rate $> 1/2$. This gives our whole privacy amplification protocol.
Note that the entropy loss is $O(2^C s) = 2^{\mathrm{poly}(1/\delta)} s$. For any constant $\delta > 0$ this is still $O(s)$. By
using the improved non-malleable extractor in [Li12] that has short seed length and large output
size (in fact, it suffices to use the non-malleable condensers in [Li12] instead of extractors), we can
achieve randomness complexity (the number of truly random bits needed) $O(Cs) = \mathrm{poly}(1/\delta)s$ and
communication complexity $O(2^C s) = 2^{\mathrm{poly}(1/\delta)} s$.

Organization. The rest of the paper is organized as follows. We give some preliminaries in 
Section 3. In Section 4 we show that non-malleable extractors can be used to construct two- 
source extractors. In Section 5 we show that two-source extractors based on the inner product 
function can be used to construct non-malleable extractors. In Section 6 we give our new and 
improved constructions of non-malleable extractors. In Section 7 we give our privacy amplification 
protocol for arbitrarily linear min-entropy. We conclude with some open problems in Section 8. 
The existence of generalized non-malleable extractors is proved in Appendix A and an alternative
construction of non-malleable extractors for entropy $(1/2 - \delta)n$ is given in Appendix B.

3 Preliminaries 

We often use capital letters for random variables and corresponding small letters for their
instantiations. Let $|S|$ denote the cardinality of the set $S$. Let $\mathbb{Z}_r$ denote the cyclic group $\mathbb{Z}/(r\mathbb{Z})$, and let
$\mathbb{F}_q$ denote the finite field of size $q$. All logarithms are to the base 2.

3.1 Probability distributions 

Definition 3.1 (statistical distance). Let $W$ and $Z$ be two distributions on a set $S$. Their statistical
distance (variation distance) is

$$\Delta(W, Z) \stackrel{\mathrm{def}}{=} \max_{T \subseteq S} |W(T) - Z(T)| = \frac{1}{2} \sum_{s \in S} |W(s) - Z(s)|.$$

We say $W$ is $\epsilon$-close to $Z$, denoted $W \approx_\epsilon Z$, if $\Delta(W, Z) \le \epsilon$. For a distribution $D$ on a set $S$ and
a function $h : S \to T$, let $h(D)$ denote the distribution on $T$ induced by choosing $x$ according to $D$
and outputting $h(x)$. We often view a distribution as a function whose value at a sample point is
the probability of that sample point. Thus $\|W - Z\|_{\ell_1}$ denotes the $\ell_1$ norm of the difference of the
distributions specified by the random variables $W$ and $Z$, which equals $2\Delta(W, Z)$.

Definition 3.2. A function $\mathsf{TExt} : \{0,1\}^{n_1} \times \{0,1\}^{n_2} \to \{0,1\}^m$ is a strong two source extractor
for min-entropy $k_1, k_2$ and error $\epsilon$ if for every independent $(n_1, k_1)$ source $X$ and $(n_2, k_2)$ source $Y$,

$$|(\mathsf{TExt}(X, Y), X) - (U_m, X)| < \epsilon$$

and

$$|(\mathsf{TExt}(X, Y), Y) - (U_m, Y)| < \epsilon.$$
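As a small sanity check of Definition 3.1, the following sketch computes the statistical distance both as half the $\ell_1$ norm and as a maximum over all events $T \subseteq S$, and confirms that the two agree on a toy example. The function names and the example distributions are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def stat_dist(W, Z):
    """Half the l1 distance between distributions given as dicts point -> probability."""
    support = set(W) | set(Z)
    return sum(abs(W.get(s, 0) - Z.get(s, 0)) for s in support) / 2


def stat_dist_max(W, Z):
    """Maximum over all events T of |W(T) - Z(T)| (brute force, small supports only)."""
    support = sorted(set(W) | set(Z))
    best = Fraction(0)
    for r in range(len(support) + 1):
        for T in combinations(support, r):
            diff = abs(sum(W.get(s, 0) for s in T) - sum(Z.get(s, 0) for s in T))
            best = max(best, diff)
    return best


W = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
Z = {s: Fraction(1, 4) for s in range(4)}  # uniform on {0, 1, 2, 3}
```

Here the maximizing event is $T = \{0\}$, where $W$ puts mass $1/2$ and $Z$ puts mass $1/4$.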
3.2 Somewhere Random Sources, Extractors and Condensers

Definition 3.3 (Somewhere Random sources). A source $X = (X_1, \ldots, X_t)$ is $(t \times r)$ somewhere-random
(SR-source for short) if each $X_i$ takes values in $\{0,1\}^r$ and there is an $i$ such that $X_i$ is
uniformly distributed.

Definition 3.4. An elementary somewhere-$k$-source is a vector of sources $(X_1, \ldots, X_t)$ such that
some $X_i$ is a $k$-source. A somewhere $k$-source is a convex combination of elementary
somewhere-$k$-sources.

Definition 3.5. A function $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(k \to l, \epsilon)$-condenser if for every
$k$-source $X$, $C(X, U_d)$ is $\epsilon$-close to some $l$-source. When convenient, we call $C$ a
rate-$(k/n \to l/m, \epsilon)$-condenser.

Definition 3.6. A function $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(k \to l, \epsilon)$-somewhere-condenser
if for every $k$-source $X$, the vector $(C(X, y))_{y \in \{0,1\}^d}$ is $\epsilon$-close to a somewhere-$l$-source. When
convenient, we call $C$ a rate-$(k/n \to l/m, \epsilon)$-somewhere-condenser.

We are going to use condensers recently constructed based on the sum-product theorem. The
following constructions are due to Zuckerman [Zuc07].

Theorem 3.7 ([Zuc07]). There exists a constant $\alpha > 0$ such that for any constant $0 < \delta < 0.9$, there
is an efficient family of rate-$(\delta \to (1 + \alpha)\delta, \epsilon = 2^{-\Omega(n)})$-somewhere condensers $\mathsf{Scond} : \{0,1\}^n \to
(\{0,1\}^m)^2$ where $m = \Omega(n)$.

Theorem 3.8 ([BKS+05, Raz05, Zuc07]). For any constants $\beta, \delta > 0$, there is an efficient family of
rate-$(\delta \to 1 - \beta, \epsilon = 2^{-\Omega(n)})$-somewhere condensers $\mathsf{Cond} : \{0,1\}^n \to (\{0,1\}^m)^D$ where $D = O(1)$
and $m = \Omega(n)$.



3.3 Average conditional min-entropy 

Dodis and Wichs originally defined non-malleable extractors with respect to average conditional 
min-entropy, a notion defined by Dodis, Ostrovsky, Reyzin, and Smith [DORS08]. 



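Definition 3.9. The average conditional min-entropy is defined as

$$\tilde{H}_\infty(X|W) = -\log\left(\mathbb{E}_{w \leftarrow W}\left[\max_x \Pr[X = x \mid W = w]\right]\right) = -\log\left(\mathbb{E}_{w \leftarrow W}\left[2^{-H_\infty(X|W = w)}\right]\right).$$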



Average conditional min-entropy tends to be useful for cryptographic applications. By taking 
W to be the empty string, we see that average conditional min-entropy is at least as strong as 
min-entropy. In fact, the two are essentially equivalent, up to a small loss in parameters. 

We have the following lemmas. 

Lemma 3.10 ([DORS08]). For any $s > 0$, $\Pr_{w \leftarrow W}[H_\infty(X|W = w) \ge \tilde{H}_\infty(X|W) - s] \ge 1 - 2^{-s}$.

Lemma 3.11 ([DORS08]). If a random variable $B$ has at most $2^\ell$ possible values, then $\tilde{H}_\infty(A|B) \ge
H_\infty(A) - \ell$.
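To make Definition 3.9 and Lemma 3.11 concrete, the sketch below computes both entropies for a toy joint distribution. It uses the identity $\mathbb{E}_{b}[\max_a \Pr[A = a \mid B = b]] = \sum_b \max_a \Pr[A = a, B = b]$. The function names and the example ($A$ uniform on four values, $B = A \bmod 2$) are illustrative assumptions.

```python
from math import log2


def min_entropy(dist):
    """H_inf of a distribution given as dict point -> probability."""
    return -log2(max(dist.values()))


def avg_cond_min_entropy(joint):
    """Average conditional min-entropy of A given B.

    joint: dict (a, b) -> Pr[A = a, B = b].
    """
    best = {}
    for (a, b), p in joint.items():
        best[b] = max(best.get(b, 0.0), p)  # max_a Pr[A = a, B = b]
    return -log2(sum(best.values()))


# A is uniform on {0, 1, 2, 3}; B = A mod 2 takes 2^1 values, so l = 1.
joint = {(a, a % 2): 0.25 for a in range(4)}
marginal_A = {a: 0.25 for a in range(4)}
```

In this example $H_\infty(A) = 2$ and the average conditional min-entropy of $A$ given $B$ is $1$, so the bound of Lemma 3.11 holds with equality.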

To clarify which notion of min-entropy and non-malleable extractor we mean, we use the term
worst-case non-malleable extractor when we refer to our Definition 1.3, which is with respect to
traditional (worst-case) min-entropy, and average-case non-malleable extractor to refer to the original
definition of Dodis and Wichs, which is with respect to average conditional min-entropy.

Corollary 3.12. A $(k, \epsilon)$-average-case non-malleable extractor is a $(k, \epsilon)$-worst-case non-malleable
extractor. For any $s > 0$, a $(k, \epsilon)$-worst-case non-malleable extractor is a $(k + s, \epsilon + 2^{-s})$-average-case
non-malleable extractor.

Throughout the rest of our paper, when we say non-malleable extractor, we refer to the
worst-case non-malleable extractor of Definition 1.3.

3.4 Fourier analysis 

We give some basic and standard facts about Fourier analysis here. We normalize as in [DLWZ11].
For functions $f, g$ from a set $S$ to $\mathbb{C}$, we define the inner product $\langle f, g \rangle = \sum_{x \in S} f(x)\overline{g(x)}$. Let
$D$ be a distribution on $S$; sometimes we will also view it as a function from $S$ to $\mathbb{R}$. Note that
$\mathbb{E}_D[f(D)] = \langle f, D \rangle$. Now suppose we have functions $h : S \to T$ and $g : T \to \mathbb{C}$. Then

$$\langle g \circ h, D \rangle = \mathbb{E}_D[g(h(D))] = \langle g, h(D) \rangle.$$

Let $G$ be a finite abelian group. We say $\phi$ is a character of $G$ if it is a homomorphism from $G$
to $\mathbb{C}^\times$. We call the character that maps all elements to 1 the trivial character. Define the Fourier
coefficient $\hat{f}(\phi) = \langle f, \phi \rangle$, and let $\hat{f}$ denote the vector with entries $\hat{f}(\phi)$ for all $\phi$. Note that for a
distribution $D$, one has $\hat{D}(\phi) = \mathbb{E}_D[\phi(D)]$.

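Since the characters divided by $\sqrt{|G|}$ form an orthonormal basis, the inner product is preserved
up to scale: $\langle \hat{f}, \hat{g} \rangle = |G| \langle f, g \rangle$. As a corollary, we obtain Parseval's equality:

$$\|\hat{f}\|_{\ell_2}^2 = \langle \hat{f}, \hat{f} \rangle = |G| \langle f, f \rangle = |G| \, \|f\|_{\ell_2}^2.$$

Hence by Cauchy-Schwarz,

$$\|f\|_{\ell_1} \le \sqrt{|G|} \, \|f\|_{\ell_2} = \|\hat{f}\|_{\ell_2} \le \sqrt{|G|} \, \|\hat{f}\|_{\ell_\infty}. \qquad (1)$$

For functions $f, g : S \to \mathbb{C}$, we define the function $(f, g) : S \times S \to \mathbb{C}$ by $(f, g)(x, y) = f(x)g(y)$.
Thus, the characters of the group $G \times G$ are the functions $(\phi, \phi')$, where $\phi$ and $\phi'$ range over all
characters of $G$. We abbreviate the Fourier coefficient $\widehat{(f, g)}((\phi, \phi'))$ by $\widehat{(f, g)}(\phi, \phi')$. Note that

$$\widehat{(f, g)}(\phi, \phi') = \sum_{(x, y) \in G \times G} f(x) g(y) \overline{\phi(x) \phi'(y)} = \left(\sum_{x \in G} f(x) \overline{\phi(x)}\right)\left(\sum_{y \in G} g(y) \overline{\phi'(y)}\right) = \hat{f}(\phi)\,\hat{g}(\phi').$$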

In this paper, in the additive group of $\mathbb{F}_p$ we use the characters $e_r(s) = e^{2\pi i rs/p}$ for $r \in \mathbb{F}_p$. It is
easy to verify that $\{e_r, r \in \mathbb{F}_p\}$ indeed are characters and that these characters divided by $\sqrt{p}$ form an
orthonormal basis. Note that the trivial character corresponds to the case $r = 0$.
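Parseval's equality and the norm chain (1) can be checked numerically on the additive group $\mathbb{Z}_p$ with the characters $e_r$. The sketch below is illustrative only: the helper names and the sample function $f$ (a distribution minus the uniform distribution on $\mathbb{Z}_7$) are assumptions.

```python
import cmath


def characters(p):
    """Table of the characters e_r(s) = exp(2*pi*i*r*s/p) of Z_p."""
    return [[cmath.exp(2j * cmath.pi * r * s / p) for s in range(p)] for r in range(p)]


def fourier(f, p):
    """Fourier coefficients hat f(e_r) = <f, e_r> = sum_s f(s) * conj(e_r(s))."""
    return [sum(f[s] * chi[s].conjugate() for s in range(p)) for chi in characters(p)]


def norms(v):
    """Return the (l1, l2, l_inf) norms of a vector."""
    l1 = sum(abs(a) for a in v)
    l2 = sum(abs(a) ** 2 for a in v) ** 0.5
    linf = max(abs(a) for a in v)
    return l1, l2, linf


p = 7
D = [0.3, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1]
f = [d - 1.0 / p for d in D]          # difference from the uniform distribution
f_l1, f_l2, _ = norms(f)
_, fh_l2, fh_linf = norms(fourier(f, p))
```

With this normalization, `fh_l2` equals `p ** 0.5 * f_l2` up to floating-point error, which is Parseval's equality, and `f_l1 <= fh_l2 <= p ** 0.5 * fh_linf` reproduces the chain (1).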

We next generalize the characters to the additive group of the field $\mathbb{F}_{p^l}$. In this case, for any
$r \in \mathbb{F}_{p^l}$, we use the character $e_r(s) = e^{2\pi i (r \cdot s)/p}$, where $r$ and $s$ are viewed as vectors in $\mathbb{F}_p^l$ and $\cdot$
indicates the inner product function in $\mathbb{F}_p^l$. Again it is easy to verify that these indeed are characters
and that they form an orthonormal basis (up to a normalization factor of $p^{l/2}$).

3.5 Non-uniform XOR lemma

The following non-uniform XOR lemmas are proved in [DLWZll]. 

Lemma 3.13. Let $(W, W')$ be a random variable on $G \times G$ for a finite abelian group $G$, and suppose
that for all characters $\psi, \psi'$ on $G$ with $\psi$ nontrivial, one has

$$|\mathbb{E}_{(W, W')}[\psi(W)\psi'(W')]| \le \epsilon.$$

Then the distribution of $(W, W')$ is $\epsilon|G|$-close to $(U, W')$, where $U$ is the uniform distribution on
$G$ independent of $W'$. Moreover, for $f : G \times G \to \mathbb{R}$ defined as the difference of distributions
$(W, W') - (U, W')$, we have $\|\hat{f}\|_{\ell_\infty} \le \epsilon$.
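The statement of Lemma 3.13 can be checked numerically for $G = \mathbb{Z}_p$. The sketch below computes the largest character sum over nontrivial $\psi$ and compares $\epsilon|G|$ with the actual statistical distance from $(U, W')$. The function name and the example joint distribution on $\mathbb{Z}_3 \times \mathbb{Z}_3$ are illustrative assumptions.

```python
import cmath


def xor_lemma_quantities(joint, p):
    """joint[w][w2] = Pr[W = w, W' = w2] on Z_p x Z_p.

    Returns (eps, dist): eps is the max of |E[psi(W) psi'(W')]| over characters
    with psi nontrivial; dist is the statistical distance between (W, W') and (U, W').
    """
    def e(r, s):
        return cmath.exp(2j * cmath.pi * r * s / p)

    eps = max(
        abs(sum(joint[w][w2] * e(a, w) * e(b, w2)
                for w in range(p) for w2 in range(p)))
        for a in range(1, p)      # psi = e_a is nontrivial
        for b in range(p)         # psi' = e_b is arbitrary
    )
    marg = [sum(joint[w][w2] for w in range(p)) for w2 in range(p)]  # law of W'
    dist = sum(abs(joint[w][w2] - marg[w2] / p)
               for w in range(p) for w2 in range(p)) / 2
    return eps, dist


p = 3
joint = [[0.20, 0.05, 0.10],
         [0.10, 0.15, 0.05],
         [0.05, 0.10, 0.20]]
eps, dist = xor_lemma_quantities(joint, p)
```

The lemma guarantees `dist <= eps * p` for every joint distribution. Inequality (1) applied to $G \times G$ in fact gives the slightly stronger bound `dist <= eps * p / 2`.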

Lemma 3.14. For every cyclic group $G = \mathbb{Z}_N$ and every integer $M \le N$, there is an efficiently
computable function $\sigma : \mathbb{Z}_N \to \mathbb{Z}_M = H$ such that the following holds. Let $(W, W')$ be a random
variable on $G \times G$, and suppose that for all characters $\psi, \psi'$ on $G$ with $\psi$ nontrivial, one has

$$|\mathbb{E}_{(W, W')}[\psi(W)\psi'(W')]| \le \epsilon.$$

Then the distribution $(\sigma(W), \sigma(W'))$ is $O(\epsilon M \log N + M/N)$-close to the distribution $(U, \sigma(W'))$ where
$U$ stands for the uniform distribution over $H$ independent of $\sigma(W')$.

The following non-uniform XOR lemma is proved in [CRS12]. 

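Lemma 3.15. Let $X$ be a random variable over $\{0,1\}^m$ and $Y$ be a random variable over $\{0,1\}^n$.
For any subsets $\sigma \subseteq [m]$ and $\tau \subseteq [n]$, let $X_\sigma = \bigoplus_{i \in \sigma} X_i$ and $Y_\tau = \bigoplus_{j \in \tau} Y_j$. Assume that for any
non-empty $\sigma \subseteq [m]$ and any $\tau \subseteq [n]$, we have $X_\sigma \oplus Y_\tau \approx_\epsilon U_1$. Then

$$|(X, Y) - (U_m, Y)| \le ((2^m - 1) \cdot 2^n)^{1/2} \epsilon.$$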

3.6 Strong non-malleable extractor 

The following theorem is proved in [Rao07]. 

Theorem 3.16. [Rao07] Let $\mathsf{TExt} : \{0,1\}^{n_1} \times \{0,1\}^{n_2} \to \{0,1\}^m$ be any two source extractor for
min-entropy $k_1, k_2$ with error $\epsilon$. Then if $X$ is an $(n_1, k_1)$ source and $Y$ is an independent $(n_2, k_2')$
source, we have

$$|(\mathsf{TExt}(X, Y), Y) - (U_m, Y)| < 2^m (2^{k_2 - k_2' + 1} + \epsilon).$$

Here we prove a similar theorem that will enable our non-malleable extractor to be "strong" . 

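Theorem 3.17. Let $\mathsf{TExt} : \{0,1\}^{n_1} \times \{0,1\}^{n_2} \to \{0,1\}^m$ be a two source extractor for min-entropy
$k_1, k_2$ and $A_i : \{0,1\}^{n_2} \to \{0,1\}^{n_2}$, $i = 1, \ldots, r$ be $r$ deterministic functions such that for any
$(n_1, k_1)$ source $X$ and any independent $(n_2, k_2)$ source $Y$,

$$|(\mathsf{TExt}(X, Y), \{\mathsf{TExt}(X, A_i(Y))\}) - (U_m, \{\mathsf{TExt}(X, A_i(Y))\})| < \epsilon.$$

Then for any $(n_2, k_2')$ source $Y'$ independent of $X$,

$$|(\mathsf{TExt}(X, Y'), \{\mathsf{TExt}(X, A_i(Y'))\}, Y') - (U_m, \{\mathsf{TExt}(X, A_i(Y'))\}, Y')| < 2^{(r+1)m} (2^{k_2 - k_2' + 1} + \epsilon).$$

Proof. Let $W = \mathsf{TExt}(X, Y)$ and $W_i' = \mathsf{TExt}(X, A_i(Y))$ for any $i \in [r]$, where the seed $Y$ is specified in each
use below. Let $\overline{W}$ be the vector $(W_1', \ldots, W_r')$. Let $\bar{z}$ be the vector $(z_1, \ldots, z_r) \in (\{0,1\}^m)^r$. For any
$(z, \bar{z}) \in \{0,1\}^m \times (\{0,1\}^m)^r$ define the set of bad $y$'s for $(z, \bar{z})$ to be

$$B_{z, \bar{z}} = \{y : |\Pr[W = z, \overline{W} = \bar{z} \mid Y = y] - 2^{-m} \Pr[\overline{W} = \bar{z} \mid Y = y]| > \epsilon\},$$

where the probabilities are over $X$ with the seed fixed to $y$. Then we must have

Claim 3.18. For every $(z, \bar{z})$, $|B_{z, \bar{z}}| < 2 \cdot 2^{k_2}$.

To see this, assume for the sake of contradiction that $|B_{z, \bar{z}}| \ge 2 \cdot 2^{k_2}$ for some $(z, \bar{z})$. Let

$$B_{z, \bar{z}}^{+} = \{y : \Pr[W = z, \overline{W} = \bar{z} \mid Y = y] - 2^{-m} \Pr[\overline{W} = \bar{z} \mid Y = y] > \epsilon\}$$

and

$$B_{z, \bar{z}}^{-} = \{y : \Pr[W = z, \overline{W} = \bar{z} \mid Y = y] - 2^{-m} \Pr[\overline{W} = \bar{z} \mid Y = y] < -\epsilon\}.$$

Then $|B_{z, \bar{z}}| = |B_{z, \bar{z}}^{+}| + |B_{z, \bar{z}}^{-}|$, thus one of them must have size $\ge 2^{k_2}$. Without loss of
generality assume that $|B_{z, \bar{z}}^{+}| \ge 2^{k_2}$. Then we can let $Y$ be the uniform distribution over $B_{z, \bar{z}}^{+}$,
so that $Y$ is an $(n_2, k_2)$ source independent of $X$, but $|(W, \overline{W}) - (U_m, \overline{W})| > \epsilon$, which is a contradiction.

Let $B = \bigcup_{z, \bar{z}} B_{z, \bar{z}}$. We have $|B| < 2^{(r+1)m} \cdot 2 \cdot 2^{k_2} = 2^{(r+1)m+1} 2^{k_2}$. Now we can bound
$|(W, \overline{W}, Y') - (U_m, \overline{W}, Y')|$ when $Y'$ is an independent $(n_2, k_2')$ source, as follows.

$$\begin{aligned}
|(W, \overline{W}, Y') - (U_m, \overline{W}, Y')|
&\le \sum_{y \in \mathrm{Supp}(Y')} 2^{-k_2'} \, |(W, \overline{W})|_{Y'=y} - (U_m, \overline{W})|_{Y'=y}| \\
&= \sum_{y \in \mathrm{Supp}(Y') \cap B} 2^{-k_2'} \, |(W, \overline{W})|_{Y'=y} - (U_m, \overline{W})|_{Y'=y}| + \sum_{y \in \mathrm{Supp}(Y') \setminus B} 2^{-k_2'} \, |(W, \overline{W})|_{Y'=y} - (U_m, \overline{W})|_{Y'=y}| \\
&\le 2^{-k_2'} \, 2^{(r+1)m+1} 2^{k_2} + 2^{(r+1)m} \epsilon \\
&= 2^{(r+1)m} (2^{k_2 - k_2' + 1} + \epsilon).
\end{aligned}$$

□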

3.7 Basic properties of the inner product function 

Here we prove some basic properties of the inner product function. 

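Lemma 3.19. Let $\mathbb{F}_p$ be a field and $X, Y$ be two independent random variables over $\mathbb{F}_p^l$. Assume
that $X$ has min-entropy $k_1$ and $Y$ has min-entropy $k_2$. Let $Z = \mathsf{IP}(X, Y) = X \cdot Y$ be the inner
product function where the operation is in $\mathbb{F}_p$. For any non-trivial character $e_r$ where $r \in \mathbb{F}_p$,

$$|\mathbb{E}_{X, Y}[e_r(Z)]|^2 \le p^l \, 2^{-(k_1 + k_2)}.$$

Proof. Note that if a weak random source $W$ has min-entropy $k$, then $\|W\|_{\ell_\infty} \le 2^{-k}$, and

$$\|W\|_{\ell_2}^2 = \sum_w (\Pr[W = w])^2 \le 2^{-k} \sum_w \Pr[W = w] = 2^{-k}.$$

For a fixed $Y = y$,

$$\mathbb{E}_X[e_r(X \cdot y)] = \mathbb{E}_X[e_{ry}(X)] = \langle e_{ry}, X \rangle = \hat{X}(e_{ry}).$$

Thus

$$\mathbb{E}_{X, Y}[e_r(Z)] = \mathbb{E}_Y[\mathbb{E}_X[e_r(X \cdot Y)]] = \mathbb{E}_Y[\hat{X}(e_{rY})] = \langle Y, \hat{X} \rangle,$$

where $\hat{X}$ is indexed by $y \mapsto e_{ry}$, which is a bijection onto the characters since $r \neq 0$.
Therefore by Cauchy-Schwarz,

$$|\mathbb{E}_{X, Y}[e_r(Z)]|^2 \le \langle Y, Y \rangle \cdot \langle \hat{X}, \hat{X} \rangle = \|Y\|_{\ell_2}^2 \, \|\hat{X}\|_{\ell_2}^2 = p^l \, \|Y\|_{\ell_2}^2 \, \|X\|_{\ell_2}^2 \le p^l \, 2^{-(k_1 + k_2)}.$$

□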
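The bound of Lemma 3.19 can be sanity-checked by brute force over a tiny field. The sketch below evaluates $|\mathbb{E}[e_r(X \cdot Y)]|^2$ for two flat sources over $\mathbb{F}_3^2$ and compares it with $p^l 2^{-(k_1 + k_2)}$. The helper names and the particular supports chosen for $X$ and $Y$ are illustrative assumptions.

```python
import cmath
from itertools import product
from math import log2


def ip_bias_sq(X, Y, p, r):
    """|E[e_r(X . Y)]|^2 for independent X, Y given as dicts vector -> probability."""
    total = 0
    for x, px in X.items():
        for y, py in Y.items():
            ip = sum(a * b for a, b in zip(x, y)) % p   # inner product in F_p
            total += px * py * cmath.exp(2j * cmath.pi * r * ip / p)
    return abs(total) ** 2


def min_entropy(D):
    return -log2(max(D.values()))


p, l = 3, 2
vecs = list(product(range(p), repeat=l))
X = {v: 1 / 5 for v in vecs[:5]}     # flat source on 5 points
Y = {v: 1 / 4 for v in vecs[3:7]}    # flat source on 4 points
bound = p ** l * 2 ** (-(min_entropy(X) + min_entropy(Y)))
biases = [ip_bias_sq(X, Y, p, r) for r in range(1, p)]  # nontrivial characters only
```

Here `bound` is $9 \cdot \frac{1}{5} \cdot \frac{1}{4} = 0.45$, and every entry of `biases` is at most that value, as the lemma predicts.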



Now for any weak random source $W$, we let $2W = W + W$ stand for the distribution that
is obtained by first sampling $w_1, w_2$ from two independent and identical distributions according
to $W$, and then computing $w_1 + w_2$. Similarly $W - W$ is obtained by first sampling $w_1, w_2$ and
then computing $w_1 - w_2$. Similarly we define $cW$ to be the distribution obtained by sampling $w_1, \ldots, w_c$ from $c$
independent and identical distributions according to $W$, and then computing the sum. We now
have the following lemma.

Lemma 3.20. Let $X, Y$ be two independent random variables over $\mathbb{F}_p^l$. For any two integers $c_1, c_2 \ge 1$,
let $X_{c_1} = 2^{c_1 - 1}X - 2^{c_1 - 1}X$ and $Y_{c_2} = 2^{c_2 - 1}Y - 2^{c_2 - 1}Y$. Then for any non-trivial character $\psi$,

$$|\mathbb{E}_{X, Y}[\psi(X \cdot Y)]| \le |\mathbb{E}_{X_{c_1}, Y_{c_2}}[\psi(X_{c_1} \cdot Y_{c_2})]|^{1/2^{c_1 + c_2}}.$$

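Proof. First note

$$|\mathbb{E}_{X, Y}[\psi(X \cdot Y)]| = |\mathbb{E}_Y[\mathbb{E}_X[\psi(X \cdot Y)]]| \le \mathbb{E}_Y|\mathbb{E}_X[\psi(X \cdot Y)]|.$$

Note that $\psi(s) = e^{2\pi i rs/p}$ for some $r \in \mathbb{F}_p$. Thus by Jensen's inequality,

$$\begin{aligned}
(\mathbb{E}_Y|\mathbb{E}_X[\psi(X \cdot Y)]|)^2 &\le \mathbb{E}_Y|\mathbb{E}_X[\psi(X \cdot Y)]|^2 = \mathbb{E}_Y\left[\mathbb{E}_X[\psi(X \cdot Y)] \, \overline{\mathbb{E}_X[\psi(X \cdot Y)]}\right] \\
&= \Big|\mathbb{E}_Y \sum_{x_1, x_2} X(x_1) X(x_2) \psi((x_1 - x_2) \cdot Y)\Big| \\
&= |\mathbb{E}_Y \mathbb{E}_{X - X}[\psi((X - X) \cdot Y)]| \\
&= |\mathbb{E}_{X_1, Y}[\psi(X_1 \cdot Y)]|
\end{aligned}$$

where $X_1 = X - X$.

Applying the above procedure again, we get that

$$|\mathbb{E}_{X, Y}[\psi(X \cdot Y)]|^4 \le |\mathbb{E}_{X_1, Y}[\psi(X_1 \cdot Y)]|^2 \le |\mathbb{E}_{X_2, Y}[\psi(X_2 \cdot Y)]|,$$

where $X_2 = X_1 - X_1 = 2X - 2X$.

Repeating the procedure $c_1$ times, we get that

$$|\mathbb{E}_{X, Y}[\psi(X \cdot Y)]|^{2^{c_1}} \le |\mathbb{E}_{X_{c_1}, Y}[\psi(X_{c_1} \cdot Y)]|,$$

where $X_{c_1} = 2^{c_1 - 1}X - 2^{c_1 - 1}X$.

Similarly, we can apply the argument to $Y$ another $c_2$ times, and we get

$$|\mathbb{E}_{X, Y}[\psi(X \cdot Y)]|^{2^{c_1 + c_2}} \le |\mathbb{E}_{X_{c_1}, Y_{c_2}}[\psi(X_{c_1} \cdot Y_{c_2})]|,$$

where $X_{c_1} = 2^{c_1 - 1}X - 2^{c_1 - 1}X$ and $Y_{c_2} = 2^{c_2 - 1}Y - 2^{c_2 - 1}Y$. Thus the lemma is proved. □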

3.8 Incidence theorems 

We need the following theorems about point-line incidences. For a field $\mathbb{F}$, we call a subset $\ell \subset \mathbb{F} \times \mathbb{F}$
a line if there exist $a, b \in \mathbb{F}$ such that $\ell = \{(x, ax + b) : x \in \mathbb{F}\}$. Let $P \subset \mathbb{F} \times \mathbb{F}$ be a set of
points and $L$ be a set of lines. We say that a point $(x, y)$ has an incidence with a line $\ell$ if $(x, y) \in \ell$.
The following theorem provides a bound on the number of incidences that can be generated from
$K$ points and $K$ lines.

Theorem 3.21. [BKT04, Kon03] There exist universal constants $\alpha > 0$, $0.1 > \beta > 0$ such that for
any field $\mathbb{F}_q$ where $q$ is either prime or $2^p$ for $p$ prime, if $L, P$ are sets of $K$ lines and $K$ points
respectively, with $K \le q^{2 - \beta}$, the number of incidences $I(P, L) \le O(K^{3/2 - \alpha})$.

3.9 BCH codes 

In this paper we will only focus on BCH codes over $\mathbb{F}_2$. Given two parameters $m, t \in \mathbb{N}$, a BCH
code is a linear code with block length $n = 2^m - 1$, message length roughly $n - mt$ and distance
$d \ge 2t + 1$. Specifically, we have the following theorem.

Theorem 3.22. For all integers $m$ and $t$ there exists an explicit $[n, n - mt, 2t + 1]$-BCH code^1,
with $n = 2^m - 1$.

Since a BCH code is a linear code, we can take its parity check matrix. Note that this is an
$mt \times n$ matrix. Let $\alpha$ be a primitive element in $\mathbb{F}_{2^m}$. The $i$'th column of the parity check matrix
is of the form $(\alpha^i, (\alpha^i)^3, (\alpha^i)^5, \ldots, (\alpha^i)^{2t-1})$, for $i = 0, 1, \ldots, n - 1$. Since $\alpha$ is a generator of $\mathbb{F}_{2^m}^*$,
equivalently, for $y \in \mathbb{F}_{2^m}^*$ we can think of the $y$'th column as $(y, y^3, \ldots, y^{2t-1})$.
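The $2t$-wise linear independence of these columns, which the constructions rely on, can be verified by brute force for small parameters. The sketch below does this for $m = 4$, $t = 2$, so the columns are $(y, y^3)$ over $\mathbb{F}_{16}$. The choice of the modulus $x^4 + x + 1$ and all helper names are assumptions made for illustration.

```python
from itertools import combinations


def gf_mul(a, b, m=4, poly=0b10011):
    """Multiply in GF(2^m) represented modulo x^4 + x + 1 (for the default m = 4)."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a >> m:          # reduce when the degree reaches m
            a ^= poly
    return res


def column(y, t, m=4):
    """Column (y, y^3, ..., y^(2t-1)) packed into an mt-bit integer over F_2."""
    col, power, y_sq = 0, y, gf_mul(y, y)
    for _ in range(t):
        col = (col << m) | power
        power = gf_mul(power, y_sq)   # step from y^(2j-1) to y^(2j+1)
    return col


def wise_independent(cols, w):
    """True if every set of at most w distinct columns has a nonzero XOR."""
    for r in range(1, w + 1):
        for subset in combinations(cols, r):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == 0:
                return False
    return True


cols = [column(y, 2) for y in range(1, 16)]   # all nonzero y in F_16
```

For these parameters `wise_independent(cols, 4)` holds, matching distance $2t + 1 = 5$, while five columns can be dependent because the corresponding $[15, 7, 5]$ code has a codeword of weight 5.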

4 From Non-Malleable Extractors to Two-Source Extractors 

In this section we show that non-malleable extractors can be used to construct two-source extractors. 
First we have the following lemmas. 

Lemma 4.1. Let $X$ be a probability distribution on $\{0,1\}^{n_1}$ and $Y, Y_1, \ldots, Y_m$ be probability
distributions on $\{0,1\}^{n_2}$. Assume that there exists a function $f : \{0,1\}^{n_1} \to \{0,1\}^{n_2}$ and positive
numbers $v_1, \ldots, v_m$ with $\sum_i v_i = 1$ such that $Y = f(X)$ and $Y = \sum_i v_i Y_i$. Then there exist
probability distributions $X_1, \ldots, X_m$ on $\{0,1\}^{n_1}$ such that

$$X = \sum_i v_i X_i \quad \text{and} \quad \forall i, \; Y_i = f(X_i).$$

^1 In fact, the message length may not be exactly $n - mt$, but for simplicity we will assume that it is exactly $n - mt$.
The small error does not affect our analysis. Also, for small $t$ the message length is exactly $n - mt$.

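Proof. We define the distributions $\{X_i\}$ as follows. Let $S$ be the support of $Y$. For any $y \in S$, let
$S_y = \{x \in \mathrm{Supp}(X) : f(x) = y\}$, i.e., $S_y$ is the set of preimages of $y$. Let $p(y) = \Pr[Y = y]$ and
$\forall i$, $p_i(y) = \Pr[Y_i = y]$. Thus we have

$$p(y) = \sum_i v_i p_i(y)$$

and

$$p(y) = \sum_{x \in S_y} \Pr[X = x].$$

Now for any $x \in \mathrm{Supp}(X)$, let $y = f(x)$. Let $\Pr[X_i = x] = q_i(x) = \frac{p_i(y)}{p(y)} \Pr[X = x]$. First note
that

$$\sum_{x \in S_y} q_i(x) = \frac{p_i(y)}{p(y)} \sum_{x \in S_y} \Pr[X = x] = p_i(y).$$

Thus we have that $Y_i = f(X_i)$. Next note that

$$\sum_x q_i(x) = \sum_y \sum_{x \in S_y} q_i(x) = \sum_y p_i(y) = 1.$$

Therefore for any $i$, $X_i$ is indeed a probability distribution. Finally, note that

$$\sum_i v_i q_i(x) = \sum_i v_i \frac{p_i(y)}{p(y)} \Pr[X = x] = \frac{\Pr[X = x]}{p(y)} \sum_i v_i p_i(y) = \Pr[X = x].$$

Thus we have that $X = \sum_i v_i X_i$. □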
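The construction in this proof is short enough to run directly. The sketch below builds the $X_i$ via $q_i(x) = \frac{p_i(f(x))}{p(f(x))} \Pr[X = x]$ using exact rational arithmetic. The function name and the toy instance are illustrative assumptions.

```python
from fractions import Fraction as Fr


def lift_decomposition(X, f, Ys):
    """Lift a decomposition Y = sum_i v_i Y_i of Y = f(X) to one of X.

    X : dict x -> Pr[X = x];  f : dict x -> y;  Ys : list of dicts y -> Pr[Y_i = y].
    Returns the list of X_i with Pr[X_i = x] = p_i(f(x)) / p(f(x)) * Pr[X = x].
    """
    p = {}
    for x, px in X.items():
        p[f[x]] = p.get(f[x], 0) + px          # p(y) = Pr[f(X) = y]
    return [{x: Yi.get(f[x], 0) / p[f[x]] * px for x, px in X.items()} for Yi in Ys]


X = {"a": Fr(1, 2), "b": Fr(1, 4), "c": Fr(1, 4)}
f = {"a": 0, "b": 0, "c": 1}                   # so Y = f(X) has law {0: 3/4, 1: 1/4}
v = [Fr(1, 2), Fr(1, 2)]
Ys = [{0: Fr(1)}, {0: Fr(1, 2), 1: Fr(1, 2)}]  # Y = 1/2 * Y_1 + 1/2 * Y_2
Xs = lift_decomposition(X, f, Ys)
```

In this instance $X_1$ puts mass $2/3, 1/3, 0$ on $a, b, c$ and $X_2$ puts mass $1/3, 1/6, 1/2$, and the mixture with weights $v$ recovers $X$.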

Lemma 4.2. Let $X$ be a probability distribution on $\{0,1\}^{n_1}$. Assume that there exists a function
$f : \{0,1\}^{n_1} \to \{0,1\}^{n_2}$ such that $Y = f(X)$ is an $(n_2, k)$-source. Then there exist an integer $m$
and (flat) $(n_1, k)$-sources $X_1, \ldots, X_m$, bijections $f_1, \ldots, f_m : \{0,1\}^{n_1} \to \{0,1\}^{n_2}$, and positive numbers
$v_1, \ldots, v_m$ with $\sum_i v_i = 1$ such that

$$X = \sum_i v_i X_i \quad \text{and} \quad \forall i, \; f(X_i) = f_i(X_i).$$

Proof. First, note that every $(n_2, k)$-source is a convex combination of flat $(n_2, k)$-sources. By
Lemma 4.1, if $Y = f(X)$ and $Y = \sum_i v_i Y_i$, then there exist probability distributions $X_1, \ldots, X_m$
on $\{0,1\}^{n_1}$ such that $X = \sum_i v_i X_i$ and $Y_i = f(X_i)$. Thus it suffices to consider only the case where
$Y$ is a flat $(n_2, k)$-source.

Now let $Y$ be a flat $(n_2, k)$-source. For any $y \in \mathrm{Supp}(Y)$, let $S_y = \{x \in \mathrm{Supp}(X) : f(x) = y\}$.
For any $x \in \mathrm{Supp}(X)$, let $p(x) = \Pr[X = x]$ and for any $y \in \mathrm{Supp}(Y)$, let $p(y) = \Pr[Y = y]$. Thus
we have $\sum_{x \in S_y} p(x) = p(y) = 2^{-k}$. We now decompose $X$ into a convex combination and produce
the bijections $f_1, \ldots, f_m$ as follows.

Let $i = 1$. While $\bigcup_y S_y$ is not empty, do the following.

1. Pick $x$ from $\bigcup_y S_y$ such that $p(x)$ is the minimum. Assume that $x \in S_y$. Now for all $y' \in
\mathrm{Supp}(Y)$, $y' \neq y$, pick an arbitrary $x' \in S_{y'}$. Thus for any $x'$, $p(x') \ge p(x)$. We now let the
source $X_i$ be the uniform distribution over the set of the chosen $x$'s, and we let the bijection
$f_i$ be such that $f_i(x) = y$ and $f_i(x') = y'$ for all $y' \neq y$. This clearly satisfies the property
that $f(X_i) = f_i(X_i)$. Next, we let $v_i$ be the total probability mass of the chosen $x$'s, i.e., $v_i = 2^k p(x)$.

2. We now want to subtract the probability mass from both $X$ and $Y$. Thus for any chosen $x'$, we let
$p(x') = p(x') - p(x)$ and if $p(x') = 0$, remove $x'$ from $S_{y'}$. Specifically, we remove $x$ from $S_y$.
Similarly, for any $y \in \mathrm{Supp}(Y)$, let $p(y) = p(y) - p(x)$. Note that after this we still have that for
any $y \in \mathrm{Supp}(Y)$, $\sum_{x \in S_y} p(x) = p(y)$.

3. Finally, let $i = i + 1$.

Note that in the above algorithm, in each iteration at least one element will be removed from
$\bigcup_y S_y$. Thus the algorithm will terminate after finitely many steps, and we obtain $X_1, \ldots, X_m$, $f_1, \ldots, f_m$
and $v_1, \ldots, v_m$. Note that in each iteration the $p(y)$'s are equal for all $y$. Thus in each step we can
always obtain a flat $(n_1, k)$-source $X_i$ and a bijection $f_i$ such that $f_i(X_i) = f(X_i)$. Note that the
algorithm terminates only when $\bigcup_y S_y$ is empty. Since we always have that for any $y \in \mathrm{Supp}(Y)$,
$\sum_{x \in S_y} p(x) = p(y)$, when the algorithm terminates we must have that for any $y \in \mathrm{Supp}(Y)$,
$p(y) = 0$. Thus we have decomposed $X$ into a convex combination of flat sources $X_1, \ldots, X_m$. In
other words, $\sum_i v_i = 1$. □
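The greedy decomposition in this proof can be written out as a short procedure. The sketch below handles the flat case only and represents each flat source $X_i$ by a map from $y$ to the chosen preimage, which is the inverse of the bijection $f_i$ restricted to $\mathrm{Supp}(X_i)$. The function name and the toy instance are illustrative assumptions.

```python
from fractions import Fraction as Fr


def decompose_into_flat(px, f):
    """Greedy decomposition of X into flat sources, assuming f(X) is flat.

    px : dict x -> Pr[X = x] (exact rationals);  f : dict x -> y.
    Returns a list of (v_i, chosen_i), where chosen_i maps each y to the single
    preimage picked for it; X_i is uniform over chosen_i.values().
    """
    p = dict(px)
    buckets = {}
    for x in sorted(p):
        buckets.setdefault(f[x], []).append(x)       # S_y
    K = len(buckets)                                  # |Supp(Y)| = 2^k
    parts = []
    while any(buckets.values()):
        # Step 1: the remaining point of minimum mass, plus one point per other bucket.
        x_min = min((x for xs in buckets.values() for x in xs), key=lambda x: p[x])
        mass = p[x_min]
        chosen = {y: (x_min if y == f[x_min] else xs[0]) for y, xs in buckets.items()}
        parts.append((K * mass, chosen))              # v_i = 2^k * p(x)
        # Step 2: subtract the mass and drop exhausted points.
        for y, x in chosen.items():
            p[x] -= mass
            if p[x] == 0:
                buckets[y].remove(x)
    return parts


px = {"a": Fr(1, 4), "b": Fr(1, 4), "c": Fr(1, 8), "d": Fr(3, 8)}
f = {"a": 0, "b": 0, "c": 1, "d": 1}                  # f(X) is uniform on {0, 1}
parts = decompose_into_flat(px, f)
```

Because every bucket always carries the same remaining mass, no bucket empties before the others, which is why picking `xs[0]` is safe under the flatness assumption.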

Lemma 4.3. Let $X$ be a probability distribution over $\{0,1\}^{n_1}$, $Y$ be a probability distribution over
$\{0,1\}^{n_2}$ and $f : \{0,1\}^{n_1} \to \{0,1\}^{n_2}$ be any deterministic function. Assume that $|f(X) - Y| \le \epsilon$ for
some $0 < \epsilon < 1$. Then there exists a probability distribution $X'$ over $\{0,1\}^{n_1}$ such that

$$|X' - X| \le \epsilon \quad \text{and} \quad Y = f(X').$$

Proof. For any $y \in \mathrm{Supp}(Y)$, let $p(y) = \Pr[f(X) = y]$ and $q(y) = \Pr[Y = y]$. Let $S_y = \{x \in
\mathrm{Supp}(X) : f(x) = y\}$. Thus we have that $p(y) = \sum_{x \in S_y} \Pr[X = x]$. Let $W = \{y \in \mathrm{Supp}(Y) :
p(y) > q(y)\}$ and $V = \{y \in \mathrm{Supp}(Y) : p(y) < q(y)\}$. Thus we have that

$$\sum_{y \in W} |p(y) - q(y)| = \sum_{y \in V} |p(y) - q(y)| \le \epsilon.$$

We now gradually change the probability distribution $X$ into $X'$, as follows. First let $X'$ be the
same probability distribution as $X$. Then, while $W$ is not empty or $V$ is not empty, do the following.

1. Pick $y \in W \cup V$ such that $|p(y) - q(y)| = \min\{|p(y') - q(y')|, y' \in W \cup V\}$.

2. If $y \in W$, we decrease $p(y)$ to $q(y)$. Specifically, let $\delta = \tau = p(y) - q(y)$. We pick the
elements $x \in S_y$ one by one in an arbitrary order and while $\tau > 0$, do the following. Let
$\tau' = \min(\Pr[X' = x], \tau)$, $\Pr[X' = x] = \Pr[X' = x] - \tau'$ and $\tau = \tau - \tau'$. Note that since
$p(y) = \delta + q(y) \ge \delta$, this process will indeed end when $\tau = 0$ and now $p(y) = q(y)$. Now
to ensure that $X'$ is still a probability distribution, we pick any $\bar{y} \in V$ and increase $p(\bar{y})$ to
$p(\bar{y}) + \delta$. To do this, simply pick any $x \in S_{\bar{y}}$ and let $\Pr[X' = x] = \Pr[X' = x] + \delta$. Note that
after this change we still have that $p(\bar{y}) \le q(\bar{y})$. Finally, remove $y$ from $W$ and if $p(\bar{y}) = q(\bar{y})$,
remove $\bar{y}$ from $V$.

3. If $y \in V$, we increase $p(y)$ to $q(y)$. Specifically, let $\delta = q(y) - p(y)$. Pick any $x \in S_y$ and
let $\Pr[X' = x] = \Pr[X' = x] + \delta$. Now to ensure that $X'$ is still a probability distribution,
we pick any $\bar{y} \in W$ and decrease $p(\bar{y})$ to $p(\bar{y}) - \delta$. To do this, let $\tau = \delta$. We pick the
elements $x \in S_{\bar{y}}$ one by one in an arbitrary order and while $\tau > 0$, do the following. Let
$\tau' = \min(\Pr[X' = x], \tau)$, $\Pr[X' = x] = \Pr[X' = x] - \tau'$ and $\tau = \tau - \tau'$. Note that since
$p(\bar{y}) \ge \delta + q(\bar{y})$, this process will indeed end when $\tau = 0$ and we still have $p(\bar{y}) \ge q(\bar{y})$. Finally,
remove $y$ from $V$ and if $p(\bar{y}) = q(\bar{y})$, remove $\bar{y}$ from $W$.

Note that in each iteration, at least one element will be removed from $W \cup V$. Thus the iteration
will end after finitely many steps. When it ends, we have that $\forall y$, $p(y) = q(y)$. Thus $f(X') = Y$. Also, it
is clear from the algorithm that $|X' - X| = \sum_{y \in W} |p(y) - q(y)| \le \epsilon$, where $W$ and $p$ refer to their
initial values. □

We also need the following definition and theorem about non-malleable extractors with weak 
random seeds. 

Definition 4.4. [DLWZ11] A function $\mathrm{nmExt} : [N] \times [D] \to [M]$ is a $(k, k', \epsilon)$-non-malleable extractor if, for any source $X$ with $H_\infty(X) \ge k$, any seed $Y$ with $H_\infty(Y) \ge k'$, and any function $A : [D] \to [D]$ such that $A(y) \ne y$ for all $y$, the following holds:

$$(\mathrm{nmExt}(X, Y), \mathrm{nmExt}(X, A(Y)), Y) \approx_\epsilon (U_{[M]}, \mathrm{nmExt}(X, A(Y)), Y).$$

A non-malleable extractor with small error remains non-malleable even if the seed is only weakly random.

Lemma 4.5. [DLWZ11] A $(k, \epsilon)$-non-malleable extractor $\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is also a $(k, k', \epsilon')$-non-malleable extractor with $\epsilon' = 2^{d-k'} \epsilon$.

Now we can show how non-malleable extractors can be used to construct two-source extractors. In [DW09], Dodis and Wichs showed that non-malleable extractors for $(n, k)$-sources exist when $k > 2m + 3\log(1/\epsilon) + \log d + 9$ and $d > \log(n - k + 1) + 2\log(1/\epsilon) + 7$. We now show the following theorem.

Theorem 4.6. Assume that for any $\epsilon > 0$, we have explicit constructions of $(k, \epsilon)$-non-malleable extractors nmExt with seed length $d = 2\log(1/\epsilon) + o(n)$ and output length $m$. Then there exists a constant $\delta > 0$ and an explicit construction of two-source extractors that take as input an $(n, (1/2 - \delta)n)$ source and an independent $(n, k)$ source, and output $m$ bits with error $2^{-\Omega(n)}$.

Proof. Let $Y$ be an $(n, (1/2 - \delta)n)$ source and $X$ be an independent $(n, k)$ source. We construct the two-source extractor TExt as follows. First we use the somewhere-condenser in Theorem 3.7 to convert $Y$ into a source $\bar{Y} = \mathrm{Scond}(Y)$ with two rows $(Y_1, Y_2)$. By Theorem 3.7, $\bar{Y}$ is $2^{-\Omega(n)}$-close to a somewhere rate-$(1 + \alpha)(1/2 - \delta)$-source. We choose $\delta > 0$ such that $(1 + \alpha)(1/2 - \delta) = 1/2 + \delta$. Thus $\bar{Y}$ is $2^{-\Omega(n)}$-close to a somewhere rate-$(1/2 + \delta)$-source. Note that each row of $\bar{Y}$ has $\ell = \Omega(n)$ bits.

Now let $\tilde{Y}_1 = (Y_1 \circ 0)$ and $\tilde{Y}_2 = (Y_2 \circ 1)$. The two-source extractor is defined as

$$\mathrm{TExt}(X, Y) = \mathrm{nmExt}(X, \tilde{Y}_1) \oplus \mathrm{nmExt}(X, \tilde{Y}_2).$$

We now show that this is indeed a two-source extractor for $X$ and $Y$. First note that $\mathrm{Scond}(Y)$ is $2^{-\Omega(n)}$-close to a somewhere rate-$(1/2 + \delta)$-source. Lemma 4.3 implies that there exists another source $Y'$ such that $|Y - Y'| \le 2^{-\Omega(n)}$ and $\mathrm{Scond}(Y')$ is a somewhere rate-$(1/2 + \delta)$-source. In the following analysis we will treat $Y$ as $Y'$, and this will add at most $2^{-\Omega(n)}$ to the error.

Thus we now have that $\bar{Y} = \mathrm{Scond}(Y)$ is a somewhere rate-$(1/2 + \delta)$-source, and we want to show that $\mathrm{TExt}(X, Y)$ is close to uniform. Note that a somewhere rate-$(1/2 + \delta)$-source is a convex combination of elementary somewhere rate-$(1/2 + \delta)$-sources. Lemma 4.1 now implies that there exist sources $Y^1, \dots, Y^t$ such that $Y$ is a convex combination of $Y^1, \dots, Y^t$ and for each $Y^i$, $\mathrm{Scond}(Y^i)$ is an elementary somewhere rate-$(1/2 + \delta)$-source. Thus we only need to show that for each $Y^i$, $\mathrm{TExt}(X, Y^i)$ is close to uniform. Equivalently and for simplicity, we can assume without loss of generality that $\mathrm{Scond}(Y)$ is an elementary somewhere rate-$(1/2 + \delta)$-source.

Now, again without loss of generality we assume that $Y_1$ is an $(\ell, (1/2 + \delta)\ell)$-source. Consider the function $f(Y) = Y_1$. Lemma 4.2 implies that there exist flat $(n, (1/2 + \delta)\ell)$ sources $Y^1, \dots, Y^t$ and bijections $f_1, \dots, f_t$ such that $Y$ is a convex combination of $Y^1, \dots, Y^t$, and for each $i \in [t]$, $f(Y^i) = f_i(Y^i)$. Thus we only need to show that for each $Y^i$, $\mathrm{TExt}(X, Y^i)$ is close to uniform. Consider such a $Y^i$, and write $(Y_1^i, Y_2^i)$ for the rows of $\mathrm{Scond}(Y^i)$. Since $f(Y^i) = f_i(Y^i)$ and $f_i$ is a bijection, $Y_1^i$ is an $(\ell, (1/2 + \delta)\ell)$-source. Moreover, we can take $g_i$ to be the inverse function of $f_i$, and now $Y^i = g_i(Y_1^i)$. Note that $Y_2^i$ is a deterministic function of $Y^i$, thus $Y_2^i$ is a deterministic function of $Y_1^i$. Finally, note that $Y_1^i \mapsto \tilde{Y}_1^i$ and $Y_2^i \mapsto \tilde{Y}_2^i$ are both bijections. We thus have the following claim.

Claim 4.7. $\tilde{Y}_1^i$ is an $(\ell + 1, (1/2 + \delta)\ell)$ source, $\tilde{Y}_2^i = h_i(\tilde{Y}_1^i)$ where $h_i$ is a deterministic function, and $\forall y \in \mathrm{Supp}(\tilde{Y}_1^i)$, $h_i(y) \ne y$.

In other words, $\tilde{Y}_2^i$ can be viewed exactly as the seed modified by an adversary in a non-malleable extractor. Now since we are using a non-malleable extractor with seed length $d = \ell + 1 = 2\log(1/\epsilon) + o(n) = \Omega(n)$ and the seed has min-entropy $(1/2 + \delta)\ell$, by Lemma 4.5 the error of the non-malleable extractor is

$$2^{d-k'}\epsilon = 2^{\ell + 1 - (1/2 + \delta)\ell} \cdot 2^{(o(n) - \ell - 1)/2} = 2^{-(\delta\ell - o(n))} = 2^{-\Omega(n)}.$$

Therefore, we have that $|(\mathrm{nmExt}(X, \tilde{Y}_1^i), \mathrm{nmExt}(X, \tilde{Y}_2^i)) - (U_m, \mathrm{nmExt}(X, \tilde{Y}_2^i))| \le 2^{-\Omega(n)}$. Thus $\mathrm{TExt}(X, Y^i) = \mathrm{nmExt}(X, \tilde{Y}_1^i) \oplus \mathrm{nmExt}(X, \tilde{Y}_2^i)$ is $2^{-\Omega(n)}$-close to uniform. So $\mathrm{TExt}(X, Y)$ is also $2^{-\Omega(n)}$-close to uniform. ■

We can generalize this theorem to work for sources with smaller min-entropy. For this we need 
(r, k, e)-non-malleable extractors with r > 1. We will prove the following theorem in Appendix A. 

Theorem 4.8. For any constant r > 1, there exists a (r,k,e) -non-malleable extractor as long as 

d> ^log(n-A;) + 31og(l/e) + 0(l) 

k>{r + l)m + ^ + 2 log(l/e) + \og{d) + 0(1). 

We can also define (r, k, e)-non-malleable extractors with weak seed. 

Definition 4.9. A function $\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is an $(r, k, k', \epsilon)$-non-malleable extractor if, for any source $X$ with $H_\infty(X) \ge k$, any seed $Y$ with $H_\infty(Y) \ge k'$, and any $r$ functions $A_i : \{0,1\}^d \to \{0,1\}^d$, $i = 1, \dots, r$, such that $A_i(y) \ne y$ for all $i$ and $y$, the following holds:

$$(\mathrm{nmExt}(X, Y), \{\mathrm{nmExt}(X, A_i(Y))\}, Y) \approx_\epsilon (U_m, \{\mathrm{nmExt}(X, A_i(Y))\}, Y).$$
Similarly we have the following lemma.

Lemma 4.10. An $(r, k, \epsilon)$-non-malleable extractor $\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is also an $(r, k, k', \epsilon')$-non-malleable extractor with $\epsilon' = 2^{d-k'} \epsilon$.

Proof. For $y \in \{0,1\}^d$, let $e_y = \Delta((\mathrm{nmExt}(X, y), \{\mathrm{nmExt}(X, A_i(y))\}, y), (U_m, \{\mathrm{nmExt}(X, A_i(y))\}, y))$. Then for $Y$ chosen uniformly from $\{0,1\}^d$,

$$\epsilon \ge \Delta((\mathrm{nmExt}(X, Y), \{\mathrm{nmExt}(X, A_i(Y))\}, Y), (U_m, \{\mathrm{nmExt}(X, A_i(Y))\}, Y)) = \frac{1}{2^d} \sum_{y \in \{0,1\}^d} e_y.$$

Thus, for $Y'$ with $H_\infty(Y') \ge k'$, we get

$$\Delta((\mathrm{nmExt}(X, Y'), \{\mathrm{nmExt}(X, A_i(Y'))\}, Y'), (U_m, \{\mathrm{nmExt}(X, A_i(Y'))\}, Y')) = \sum_{y \in \{0,1\}^d} \Pr[Y' = y]\, e_y \le 2^{-k'} \sum_{y \in \{0,1\}^d} e_y \le 2^{d-k'} \epsilon.$$

□ 
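The averaging step in this proof can be checked numerically on a toy instance. The sketch below is illustrative only: the per-seed errors `per_seed_err` are made-up values rather than errors of any actual extractor, and the worst flat seed distribution of min-entropy $k'$ is taken to be uniform over the $2^{k'}$ seeds with the largest error.

```python
from fractions import Fraction

def weak_seed_error(per_seed_err, seed_dist):
    """Error under a seed distribution: sum over y of Pr[Y' = y] * e_y."""
    return sum(seed_dist[y] * per_seed_err[y] for y in seed_dist)

d, k_prime = 4, 2

# Hypothetical per-seed errors e_y for the 2^d seeds (illustrative values only).
per_seed_err = {y: Fraction(y % 3, 10) for y in range(2 ** d)}

# Error with a uniform seed: the average of the e_y.
eps = sum(per_seed_err.values()) / 2 ** d

# Worst flat source with min-entropy k': uniform over the 2^k' worst seeds.
worst = sorted(per_seed_err, key=per_seed_err.get, reverse=True)[: 2 ** k_prime]
flat = {y: Fraction(1, 2 ** k_prime) for y in worst}

err = weak_seed_error(per_seed_err, flat)
bound = 2 ** (d - k_prime) * eps      # the bound 2^(d - k') * eps from the lemma
```

In this toy instance `err` is 1/5 while `bound` is 3/8, so the bound holds with some slack.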

Now we have the following theorem. 

Theorem 4.11. For any constant $b > 2$ and any constant $0 < \delta < 1$, there exists a constant $C = C(\delta) = \mathrm{poly}(1/\delta)$ such that the following holds. Assume that for any $\epsilon > 0$ there exists an explicit construction of $(C, k, \epsilon)$-non-malleable extractors nmExt with seed length $d = b\log(1/\epsilon) + o(n)$ and output length $m$. Then there exists an explicit construction of two-source extractors that take as input an $(n, \delta n)$ source and an independent $(n, k)$ source, and output $m$ bits with error $2^{-\Omega(n)}$.

Proof. Let Y be an (n, 5n) source and X be an independent (n, k) source. We construct the two- 
source extractor TExt as follows. First we use the somewhere-condenser in Theorem 3.8 to convert 
Y into a source Y = Cond(y) with D = poly(l/5) rows (Yi, • • • , Yd) such that Y is 2~^(")-close 
to a somewhere rate-5:^-source. Note that each row of Y has / = bits. 

Now for each row $j$ we let $\tilde{Y}_j$ be $Y_j$ concatenated with the binary expression of $j - 1$. The two-source extractor is defined as

$$\mathrm{TExt}(X, Y) = \bigoplus_j \mathrm{nmExt}(X, \tilde{Y}_j).$$

Let C = D — 1. Again, we want to show that TExt{X,Y) is close to uniform. The proof is 
similar to the proof in Theorem 4.6. Specifically, we can assume that Y is such that Cond(y) is 
indeed a somewhere rate-^:^-source. This only adds 2~^^"^ to the error. Next, we can assume that 
Cond(y) is an an elementary somewhere rate- -source. 

Now, without loss of generality we assume that Yi is a (Z, ^^Z)-source. Consider the function 
f{Y) = Yi. Lemma 4.2 implies that there exist fiat {n,j^l) sources Y^,--- ,y* and bijections 
/i, • • • , ft such that y is a convex combination of Y^, • • • , y*, and for each i G [t], /(y*) = /i(y*). 
Thus we only need to show that for each y*, TExt(X, y*) is close to uniform. Consider such a y*. 
Since f{Y^) = fi{Y^) and fi is a bijection, Yf is a (/, j^l)-sowce. Moreover, we can take gi to be 
the inverse function of fi, and now y* = gi(Y^). Note that for any j,j 7^ 1, 1^* is a deterministic 
function of Y^, thus we have that now YJ is a deterministic function of Yf. Finally, note that for 
any j, YJ YJ is a bijection, we thus have the following claim. 
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Claim 4.12. Yf is a {I + 0(1), j^l) source, for any j,j ^ \, Y"- = hij{Yi) where hij is a deter- 
ministic function, and 7^ 1, Vy, hij(y) ^ y. 

Note that now we have C modified seeds for the non-malleable extractor. Since we are using a 
non-malleable extractor with seed length d = I + 0(1) = 61og(l/e) + o(n) = Q{n) and the seed has 
min-entropy jj^ifl, by Lemma 4.10 the error of the non-malleable extractor is 

2d-fc'g = 2^+o{i)~-i^i2{o{n)-i-o{i))/b < 2"(kett)'""(")) = 2-'^("). 

Therefore, we have that |(nmExt(X, y/), {nmExt(X, y/)}) - ([/,„, {nmExt(X, y/)})| < 2-^("). 
Thus TExt(X,y*) = 0^. nmExt(X,y/) is 2-^(")-close to uniform. So TExt(X,y) is also 2-^(")- 
close to uniform. ■ 
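The shape of the construction used in Theorems 4.6 and 4.11 can be sketched as follows. This is a structural illustration only: `cond` and `nm_ext` are caller-supplied placeholders, and the toy versions below are neither a real somewhere-condenser nor a real non-malleable extractor. The one point the sketch shows is that appending the row index makes all seeds pairwise distinct, even when two rows coincide.

```python
from functools import reduce
from typing import Callable, List

Bits = str  # bit strings such as "0110"

def tagged_seeds(rows: List[Bits]) -> List[Bits]:
    """Append the binary expression of the (0-based) row index to each row."""
    width = max(1, (len(rows) - 1).bit_length())
    return [row + format(j, f"0{width}b") for j, row in enumerate(rows)]

def two_source_ext(x: Bits, y: Bits,
                   cond: Callable[[Bits], List[Bits]],
                   nm_ext: Callable[[Bits, Bits], int]) -> int:
    """XOR of nm_ext(x, seed) over all tagged rows of cond(y)."""
    seeds = tagged_seeds(cond(y))
    return reduce(lambda a, b: a ^ b, (nm_ext(x, s) for s in seeds))

# Toy placeholders (assumptions for illustration, not the objects in the paper).
def toy_cond(y: Bits) -> List[Bits]:
    half = len(y) // 2
    return [y[:half], y[half:]]

def toy_nm_ext(x: Bits, seed: Bits) -> int:
    return bin(int(x, 2) & int(seed, 2)).count("1") % 2

out = two_source_ext("1011", "0101", toy_cond, toy_nm_ext)
```

With two rows the tags are `0` and `1`, matching $\tilde{Y}_1 = (Y_1 \circ 0)$ and $\tilde{Y}_2 = (Y_2 \circ 1)$ in the proof of Theorem 4.6.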

5 From Two-Source Extractors to Non-Malleable Extractors

In this section we show how to use a certain kind of two-source extractors to construct non-malleable 
extractors. 

We first define the following encoding of a string $y \in \{0,1\}^s$.

Definition 5.1. Given an integer $s$, we choose a BCH code with $t = 2$ and $m = s + 1$, thus the block length is $n = 2^{s+1} - 1$ and the parity check matrix is an $mt \times n$ matrix. For any $y \in \{0,1\}^s$, let $S_y$ stand for the integer whose binary expression is $y$. We encode $y$ to $\bar{y}$ such that $\bar{y}$ is the $S_y$'th column in the parity check matrix (i.e., $\mathrm{Enc}(y) = \bar{y} = (y, y^3)$ when $y$ is viewed as an element in $\mathbb{F}^*_{2^{s+1}}$).

We have the following theorem. 

Theorem 5.2. Assume that we have a two-source extractor $\mathrm{TExt} = \mathrm{IP}(f(X), W)$ such that when given an $(n_1, k)$-source $X$ and an independent $(n_2, n_2/2 - \ell)$-source $W$, TExt outputs 1 bit with error $\epsilon$. Let $n_2' = \lfloor n_2/2 \rfloor - 1$ and let $Y$ be the uniform distribution over $\{0,1\}^{n_2'}$. Define a seeded extractor

$$\mathrm{nmExt}(X, Y) = \mathrm{IP}(f(X), \mathrm{Enc}(Y)).$$

Then nmExt is a $(k, \epsilon')$-non-malleable extractor with error $\epsilon' = O(2^{-\ell} + \epsilon)$.

Proof. First let $Y'$ be a source over $\{0,1\}^{n_2'}$ with min-entropy $n_2/2 - \ell + 1$. Let $A : \{0,1\}^{n_2'} \to \{0,1\}^{n_2'}$ be any deterministic function such that $\forall y, A(y) \ne y$.

Note that the BCH code has distance $2t + 1 = 5 > 4$, thus any 4 columns in the parity check matrix must be linearly independent. This in particular implies that every two different columns must be different, so Enc is injective. Thus $\mathrm{Enc}(Y') = \bar{Y'}$ has min-entropy $n_2/2 - \ell + 1$. Therefore by the assumption we have that

$$\mathrm{nmExt}(X, Y') \approx_\epsilon U.$$

Next, note that

$$\mathrm{nmExt}(X, Y') \oplus \mathrm{nmExt}(X, A(Y')) = \mathrm{IP}(f(X), \mathrm{Enc}(Y')) \oplus \mathrm{IP}(f(X), \mathrm{Enc}(A(Y'))) = \mathrm{IP}(f(X), \bar{Y'} + \overline{A(Y')}).$$

For two different $y_1, y_2$, if $\bar{y}_1 + \overline{A(y_1)} = \bar{y}_2 + \overline{A(y_2)}$, then $\bar{y}_1, \overline{A(y_1)}, \bar{y}_2, \overline{A(y_2)}$ are linearly dependent. Note that $\overline{A(y_1)} = \mathrm{Enc}(A(y_1))$ and $\overline{A(y_2)} = \mathrm{Enc}(A(y_2))$ are also some columns of the parity check matrix. Since $A(y_1) \ne y_1$ and $A(y_2) \ne y_2$, we have that $\overline{A(y_1)} \ne \bar{y}_1$ and $\overline{A(y_2)} \ne \bar{y}_2$. Thus we must have $\overline{A(y_1)} = \bar{y}_2$ and $\overline{A(y_2)} = \bar{y}_1$.

Therefore, the min-entropy of $\bar{Y'} + \overline{A(Y')}$ is at least $n_2/2 - \ell$, since the probability of getting any particular element in the support is at most $2 \cdot 2^{-(n_2/2 - \ell + 1)} = 2^{-(n_2/2 - \ell)}$. Thus by the assumption we have

$$\mathrm{nmExt}(X, Y') \oplus \mathrm{nmExt}(X, A(Y')) \approx_\epsilon U.$$

Thus by the non-uniform XOR lemma, Lemma 3.13, we have

$$|(\mathrm{nmExt}(X, Y'), \mathrm{nmExt}(X, A(Y'))) - (U, \mathrm{nmExt}(X, A(Y')))| \le 2\epsilon.$$

Now note that $Y$ has min-entropy $n_2' = \lfloor n_2/2 \rfloor - 1$, thus by Theorem 3.17,

$$|(\mathrm{nmExt}(X, Y), \mathrm{nmExt}(X, A(Y)), Y) - (U, \mathrm{nmExt}(X, A(Y)), Y)| \le 2^2 (2^{-(\ell - 4)} + 2\epsilon) = O(2^{-\ell} + \epsilon).$$
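The key property used in this proof, that any 4 columns $(y, y^3)$ of the parity check matrix are linearly independent over $\mathbb{F}_2$, can be verified by brute force on a small field. The sketch below is an illustration under assumptions: it works in the toy field $\mathrm{GF}(2^4)$ with the irreducible polynomial $x^4 + x + 1$, and the helper names are hypothetical.

```python
from itertools import combinations

M = 4            # toy field GF(2^4); the paper works in GF(2^(s+1))
POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^M), represented as M-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # degree reached M: reduce modulo POLY
            a ^= POLY
    return r

def enc(y):
    """Enc(y) = (y, y^3), packed into a single 2M-bit integer."""
    return (y << M) | gf_mul(y, gf_mul(y, y))

def columns_independent(width):
    """True if no non-empty set of at most `width` distinct columns XORs to 0."""
    cols = [enc(y) for y in range(1, 2 ** M)]
    for size in range(1, width + 1):
        for subset in combinations(cols, size):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == 0:
                return False
    return True

four_wise = columns_independent(4)
```

Over $\mathbb{F}_2$, linear dependence of a set of vectors means some non-empty subset sums to zero, which is why checking XORs of subsets suffices.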



Similarly, we can generalize the above theorem to the case of $(r, k, \epsilon)$-non-malleable extractors. We have the following definition and theorem.

Definition 5.3. Given two integers $r, s$, we choose a BCH code with $t = r + 1$ and $m = s + 1$, thus the block length is $n = 2^{s+1} - 1$ and the parity check matrix is an $mt \times n$ matrix. For any $y \in \{0,1\}^s$, let $S_y$ stand for the integer whose binary expression is $y$. We encode $y$ to $\bar{y}$ such that $\bar{y}$ is the $S_y$'th column in the parity check matrix (i.e., $\mathrm{Enc}_r(y) = \bar{y} = (y, y^3, \dots, y^{2r+1})$ when $y$ is viewed as an element in $\mathbb{F}^*_{2^{s+1}}$).

Theorem 5.4. Given two integers $r, \ell$ such that $\ell > r$. Assume that we have a two-source extractor $\mathrm{TExt} = \mathrm{IP}(f(X), W)$ such that when given an $(n_1, k)$-source $X$ and an independent $(n_2, n_2/(r+1) - \ell)$-source $W$, TExt outputs 1 bit with error $\epsilon$. Let $n_2' = \lfloor n_2/(r+1) \rfloor - 1$ and let $Y$ be the uniform distribution over $\{0,1\}^{n_2'}$. Define a seeded extractor

$$\mathrm{nmExt}(X, Y) = \mathrm{IP}(f(X), \mathrm{Enc}_r(Y)).$$

Then nmExt is an $(r, k, \epsilon')$-non-malleable extractor with error $\epsilon' = O(r 2^{r - \ell} + 2^{2r} \epsilon)$.

Proof. First let $Y'$ be a source over $\{0,1\}^{n_2'}$ with min-entropy $n_2/(r+1) - \ell + \log(r+1)$. Let $A_i : \{0,1\}^{n_2'} \to \{0,1\}^{n_2'}$, $i = 1, \dots, r$, be $r$ deterministic functions such that for any $i$, $\forall y, A_i(y) \ne y$.

Note that the BCH code has distance $2t + 1 = 2r + 3 > 2r + 2$, thus any $2r + 2$ columns in the parity check matrix must be linearly independent. This in particular implies that every two different columns must be different. Thus $\mathrm{Enc}_r(Y') = \bar{Y'}$ has min-entropy $n_2/(r+1) - \ell + \log(r+1)$. Therefore by the assumption we have that

$$\mathrm{nmExt}(X, Y') \approx_\epsilon U.$$

Next, choose any non-empty subset $S \subseteq [r]$, and note that

$$\mathrm{nmExt}(X, Y') \oplus \bigoplus_{i \in S} \mathrm{nmExt}(X, A_i(Y')) = \mathrm{IP}(f(X), \mathrm{Enc}_r(Y')) \oplus \bigoplus_{i \in S} \mathrm{IP}(f(X), \mathrm{Enc}_r(A_i(Y'))) = \mathrm{IP}\Big(f(X), \bar{Y'} + \sum_{i \in S} \overline{A_i(Y')}\Big).$$

For two different $y_1, y_2$, if $\bar{y}_1 + \sum_{i \in S} \overline{A_i(y_1)} = \bar{y}_2 + \sum_{i \in S} \overline{A_i(y_2)}$, then $\bar{y}_1, \{\overline{A_i(y_1)}, i \in S\}, \bar{y}_2, \{\overline{A_i(y_2)}, i \in S\}$ are linearly dependent. Without loss of generality we assume that all $\overline{A_i(y_1)}$ are different and all $\overline{A_i(y_2)}$ are different, since if not then some of them sum to 0 and this only decreases the number of items. Note that $\overline{A_i(y_1)} = \mathrm{Enc}_r(A_i(y_1))$ and $\overline{A_i(y_2)} = \mathrm{Enc}_r(A_i(y_2))$ are also some columns of the parity check matrix. Since the columns are $(2r+2)$-wise linearly independent, and the total number of items here is at most $2r + 2$, we must have that the items in $\{\bar{y}_1, \{\overline{A_i(y_1)}, i \in S\}\}$ and the items in $\{\bar{y}_2, \{\overline{A_i(y_2)}, i \in S\}\}$ form a perfect matching, where an edge in the matching is of the form $\bar{y}_1 = \overline{A_j(y_2)}$, $\overline{A_i(y_1)} = \bar{y}_2$ or $\overline{A_i(y_1)} = \overline{A_j(y_2)}$.

Now we claim that for any $y$, there are at most $r$ different $y_j$'s such that $\bar{y} + \sum_{i \in S} \overline{A_i(y)} = \bar{y}_j + \sum_{i \in S} \overline{A_i(y_j)}$. To see this, assume for the sake of contradiction that there are $r + 1$ such different $y_j$'s. Then by the above discussion we see that for each $j$, there exists an $i \in S$ such that $\bar{y}_j = \overline{A_i(y)}$. Since $|S| \le r$ we must have two different $y_j$ and $y_l$ and an $i \in S$ such that $\bar{y}_j = \overline{A_i(y)} = \bar{y}_l$. Note that $\mathrm{Enc}_r$ is injective, thus we have $y_j = y_l$, a contradiction.

Therefore, the min-entropy of $\bar{Y'} + \sum_{i \in S} \overline{A_i(Y')}$ is at least $n_2/(r+1) - \ell + \log(r+1) - \log(r+1) = n_2/(r+1) - \ell$. Therefore by the assumption we have that

$$\mathrm{nmExt}(X, Y') \oplus \bigoplus_{i \in S} \mathrm{nmExt}(X, A_i(Y')) \approx_\epsilon U.$$

Thus by the non-uniform XOR lemma, Lemma 3.15, we have

$$|(\mathrm{nmExt}(X, Y'), \{\mathrm{nmExt}(X, A_i(Y'))\}) - (U, \{\mathrm{nmExt}(X, A_i(Y'))\})| \le 2^r \epsilon.$$

Now note that $Y$ has min-entropy $n_2' = \lfloor n_2/(r+1) \rfloor - 1$, thus by Theorem 3.17,

$$|(\mathrm{nmExt}(X, Y), \{\mathrm{nmExt}(X, A_i(Y))\}, Y) - (U, \{\mathrm{nmExt}(X, A_i(Y))\}, Y)| \le 2^{r+1} (2^{-(\ell - \log(r+1) - 3)} + 2^r \epsilon) = O(r 2^{r-\ell} + 2^{2r} \epsilon).$$



6 Improved Constructions of Non-Malleable Extractors 

Now we can use the above theorems to construct new non-malleable extractors. As a warm up, we 
first give a new construction of non-malleable extractors for min-entropy rate > 1/2. 

6.1 A Non-Malleable Extractor for Entropy Rate > 1/2 

For this purpose, simply notice that the inner product function itself is a two-source extractor for an $(n, (1/2 + \delta)n)$ source and another independent source on $n$ bits with min-entropy slightly below $n/2$. Specifically, we have

Theorem 6.1. [CG88, Vaz85] For every constant $\delta > 0$, if $X$ is an $(n, k_1)$ source, $Y$ is an independent $(n, k_2)$ source and $k_1 + k_2 \ge (1 + \delta)n$, then

$$\mathrm{IP}(X, Y) \approx_\epsilon U$$

with $\epsilon = 2^{-\Omega(\delta n)}$.

Thus we can use the following construction. Given an $(n, k)$-source $X$ with $k = (1/2 + \delta)n$, take an independent uniform seed $Y \in \{0,1\}^{n/2 - 1}$ and encode $Y$ to $\bar{Y}$ such that $\bar{Y} = \mathrm{Enc}(Y) = (Y, Y^3)$ when $Y$ is viewed as an element in $\mathbb{F}^*_{2^{n/2}}$. Our non-malleable extractor is now defined as

$$\mathrm{nmExt}(X, Y) = \mathrm{IP}(X, \mathrm{Enc}(Y)) = \mathrm{IP}(X, \bar{Y}),$$

where IP is the inner product function over $\mathbb{F}_2$.
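A toy instance of this construction, with $n = 8$, can be written down directly. The sketch below is illustrative and rests on assumptions not fixed by the text: it uses $\mathrm{GF}(2^4)$ with the polynomial $x^4 + x + 1$, and it maps an $(n/2 - 1)$-bit seed to a nonzero field element by setting the top bit, which is one injective choice among many.

```python
M = 4            # n/2 for a toy source length n = 8
POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^M), represented as M-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def ip(a, b):
    """Inner product over GF(2) of two bit vectors packed as integers."""
    return bin(a & b).count("1") % 2

def nm_ext(x, y):
    """x: a 2M-bit sample of the source; y: an (M-1)-bit seed."""
    elem = y | (1 << (M - 1))                 # assumed injective map to a nonzero element
    cube = gf_mul(elem, gf_mul(elem, elem))   # elem^3 in GF(2^M)
    return ip(x, (elem << M) | cube)          # IP(x, (Y, Y^3))

bit = nm_ext(0b11111111, 0)
```

For a fixed seed the output is an $\mathbb{F}_2$-linear function of the source sample; the non-malleability argument relies on how the encodings of different seeds relate to each other, not on any nonlinearity in $x$.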

Theorem 6.2. For any constant $\delta > 0$, the function nmExt defined as above is a $((1/2 + \delta)n, 2^{-\Omega(\delta n)})$-non-malleable extractor.

Proof. By Theorem 6.1, $\mathrm{IP}(X, Y)$ is a two-source extractor for an $(n, (1/2 + \delta)n)$ source and an independent $(n, (1/2 - \delta/2)n)$ source with error $2^{-\Omega(\delta n)}$. Thus by Theorem 5.2, nmExt is a $((1/2 + \delta)n, \epsilon)$-non-malleable extractor with $\epsilon = O(2^{-\delta n/2} + 2^{-\Omega(\delta n)}) = 2^{-\Omega(\delta n)}$. ■

6.2 Non-Malleable Extractors for Entropy Rate < 1/2 

In this section we give one of our main constructions, namely a non-malleable extractor for weak sources with min-entropy rate $1/2 - \delta$ for some universal constant $\delta > 0$. We have the following construction.

Given an $(n, k)$-source $X$ with $k = (1/2 - \delta)n$, we first pick a prime $p$ that is close to $n$. By Bertrand's postulate and [BHP01], there exists $n_0 \in \mathbb{N}$ such that for every $n \ge n_0$, there exists a prime between $n$ and $n + O(n^{0.525})$. We will pick a prime $p$ in this range. Note that the prime can be found in polynomial time in $n$. Take the field $\mathbb{F}_q$ where $q = 2^p$ and let $g$ be a generator in $\mathbb{F}_q^*$. The construction is as follows.

• Treat $X$ as an element in $\mathbb{F}_q^*$ and encode $X$ such that $\mathrm{Enc}(X) = (X, g^X)$.

• Take an independent and uniform seed $Y \in \{0,1\}^{p-1}$ and encode $Y$ to $\bar{Y}$ such that $\bar{Y} = (Y, Y^3)$ when $Y$ is viewed as an element in $\mathbb{F}^*_{2^p}$.

• Output $\mathrm{nmExt}(X, Y) = \mathrm{IP}(\mathrm{Enc}(X), \bar{Y})$ where IP is the inner product function over $\mathbb{F}_2$.

To prove our construction is a non-malleable extractor, we are going to use Theorem 5.2. To this end, we first prove the following lemma.

Lemma 6.3. There exists a constant $\delta > 0$ such that for any $(n, k)$-source $X$ with $k = (1/2 - \delta)n$, and any independent $(2p, k_2)$ source $Y$ with $k_2 \ge (1 - \delta)p$,

$$|\mathrm{IP}(\mathrm{Enc}(X), Y) - U| \le \epsilon,$$

where $\epsilon = 2^{-\Omega(n)}$.
Proof. We think of $X$ as a distribution in $\mathbb{F}_q^*$ that has min-entropy $k$. This increases the error by
at most (for the element 0). By the XOR lemma, we only need to show that for the only 
non-trivial character ^lJ (since we only output 1 bit), 

|i?x,r[^(IP(Enc(X),y))]| <2-^H. 
Let X' = 4Enc(X) - 4Enc(X), by Lemma 3.20 we have 

\Ex,Ym\P{Enc{X),Ym < \Ex',YmX' ■Y)]\-s . 

We next bound \Ex' yii^i^' " ^)]|- First we show that X' is close to a source with min-entropy 
rate > 1/2. We have the following claim. 

Claim 6.4. There is a universal constant $\delta > 0$ such that if $X$ is any weak source with min-entropy $(1/2 - \delta)n$, then $3\mathrm{Enc}(X)$ is $2^{-\Omega(n)}$-close to a source with min-entropy $(1/2 + \delta)(2p)$.

Proof of the claim. Note that k = (1/2 — 6)n and p is between n and n(l -|- Thus for 

sufficiently large n we have that k > (1/2 — 1.016)p. Note that we choose the field Fg where q = 2^. 
Thus the sum of Enc{X) + Enc(X) when viewing Enc(X) as a vector in is the same as when 
viewing Enc(X) as a vector in F^. In the following we will view Enc(X) as a vector in F^. We show 
that 3Enc(X) has a larger min-entropy rate. 

First consider the distribution 2Enc(X). Note that the distribution is of the form {X -|- X, + 
Q'^). Let X = and note that g^ is a bijection in F*. Thus X has the same min-entropy as X. 
Now the support of 2Enc(X) is of the form {\ogg{xiX2),xi + X2). For any (6, a) in this support, 
we have that xiX2 = and x\ + X2 = a. Thus there are at most 2 different pairs of (xi,X2) that 
satisfy both equations. Therefore the min-entropy of 2Enc(X) is at least 2Hao{X) — 1. We can also 
assume that a 7^ since this only increases the error by at most 2"^°°^"''-^ Now let k = H^{X) — 1, 
we have that Enc(X) has min-entropy at least k and 2Enc(X) has min-entropy at least 2k. 

Now consider 3Enc(X). Every element in the support of 3Enc(X) has the form (logg(xiX2a;3), xi+ 
^2 + x^), which determines the point {xiX2X^,xi + X2 + X3). Let a = xi + X2 and h = X1X2, this 
point is 

(te3,a + X3). 

Let X3 = a + X3, then 

[a + X3, 6x3) = (^3,^3 - ah). 
For a fixed (a = xi + X2, h = X1X2) define the line 

4,b = {{x, bx - ab)\x £ ¥q}. 

Thus we have a set of lines L = {ia,b}- Note that a 7^ and 6 7^ 0. Thus for different {a,b), 
the line ia^b is also different. Note that X3 is sampled from X3, which has min-entropy k and 
(a, b) is sampled from Enc(Xi) + Enc(X2), which has min-entropy 2k. Further note that these two 
distributions are independent. Since every weak source with min-entropy /c is a convex combination 
of flat k sources, without loss of generality we can assume that X3 and Enc(Xi) + Enc(X2) are both 
flat sources. Thus L has size 2"^^. 
Now let $\alpha, \beta$ be the two constants in Theorem 3.21. Assume that $3\mathrm{Enc}(X)$ is $\epsilon$-far from any
source with min-entropy (1 + a/2)2k. Since 3Enc(X) determines the distribution (A + X3, BX^), 
this distribution is also e-far from any source with min-entropy (1 + q/2)2A;. Thus there must exist 
some set M of size at most 2^^+"/^)^'^ such that 



Pr \(a + X3,bxs) £ M]>e. 

{a,b)^2Enc{X),X3^X 

Note that whenever (a + X3, 6x3) G M, this point has an incidence with the line ia,b- Further 
note that whenever (a, 6) is different or 3:3 is different, the incidence is also different. Thus by the 
above inequality the number of incidences between the set of points M and the set of lines L is at 
least 

Pr [(a + X3, bxs) G M]2''2'^'' > e2^^ . 

(a,b)<~2Enc(X),X3<-X 

On the other hand, since L has size 2^'= and M has size 2(i+"/2)2fc < 2{i+"/2)2{i/2-5)p ^ 
2(i+a/2)p ^ q^~^ , by Theorem 3.21, the number of incidences between M and L is at most 

Thus we must have e < . 

Thus we have shown that 3Enc(X) is 2~'^^I'^-q\.o's,q to having min-entropy (1 + a/2)2k. By 
choosing 5 appropriately, we get that 3Enc(X) is 2~^(")-close to having min-entropy {\/2+5)2p. □ 

Now note that y is a weak source over {0, 1}^^ with min-entropy k2> {1 — d)p. Also note that 
the min-entropy of X' is at least the min-entropy of 3Enc(X). Thus by Lemma 3.19 we have that 

lEx'xii^i^' ■y)]\< 22P2-(^/2+'^)2P2-(i-'5)p + 2-^(") = 2"^("). 

Therefore 

|i?x,y[V(IP(Enc(X),y))]| <2-^("). 

□ 

Now we can prove our construction is a non-malleable extractor. 

Theorem 6.5. For any $(n, k)$-source $X$ with $k = (1/2 - \delta)n$, the function nmExt defined above is a $(k, \epsilon)$-non-malleable extractor with $\epsilon = 2^{-\Omega(n)}$.

Proof. By Lemma 6.3, $\mathrm{IP}(\mathrm{Enc}(X), Y)$ is a two-source extractor for an $(n, k)$ source and an independent $(2p, (1 - \delta)p)$ source with error $2^{-\Omega(n)}$. Therefore by Theorem 5.2, nmExt is a $(k, \epsilon)$-non-malleable extractor with error $\epsilon = O(2^{-\delta p} + 2^{-\Omega(n)}) = 2^{-\Omega(n)}$. ■



6.3 Achieving Even Smaller Min-Entropy 

In this section we show that we can construct non-malleable extractors for even smaller min-entropy rate (potentially any constant arbitrarily close to 0), if we assume that we have affine extractors with large enough output size, and the Approximate Duality Conjecture (or the Polynomial Freiman-Ruzsa Conjecture) as in [BSZ11].

Recall the definition of an affine extractor. 






Definition 6.6. An $[n, m, \rho, \epsilon]$ affine extractor is a deterministic function $f : \{0,1\}^n \to \{0,1\}^m$ such that whenever $X$ is the uniform distribution over some affine subspace over $\mathbb{F}_2^n$ with dimension $\rho n$, we have that for every $z \in \{0,1\}^m$,

$$|\Pr[f(X) = z] - 2^{-m}| \le \epsilon.$$

Now we define the duality measure of two sets as in [BSZ11].

Definition 6.7. [BSZ11] Given two sets $A, B \subseteq \mathbb{F}_2^n$, their duality measure is defined as

The following conjecture is introduced in [BSZll] and is shown in that paper to be implied by 
the well-known Polynomial Freiman-Ruzsa Conjecture in additive combinatorics. 

Conjecture 6.8. (Approximate Duality (ADC)) [BSZll J For every pair of constants a,6 > there 
exist a constant C > and an integer r, both depending on a and 6 such that the following holds 
for sufficiently large n. IfA,B C F^ satisfy \A\,\B\ > and fi-^{A,B) > 2"^", then there exists 
a pair of subsets 

A'CA,A'>^ an4 B' c B, \B'\ > f ifiiMl V . 
- ' - 2^^+^ - u \ - y 2 J 2^"^ 

such that {A', B') = 1. 

We now have the following construction. 

Construction 6.9. Given any (n, k) source X and a constant < A < 1, let / : {0, 1}""' — )■ {0, 1}"*' 
be an [n', m' = (1 — A)|n', |, 2^™ ] affine extractor such that n = n' — m'. For any z E {0, 1}™" , 
let f-^{z) = {x : f{x) = z}. Then there exists z G {0,1}"^' such that 1/-^^)! > 2". Let F : 
{0, 1}" —7- f~^{z) be (any) injective map. Now take an independent uniform seed Y G {0, 1}" /^"^ 
and encode Y to Y such that Y = Enc{Y) = (Y, Y^) when Y is viewed as an element in F*„,y2 ■ Our 
non-malleable extractor is now defined as 

nmExt(X,y) = \P{F{X),Enc{Y)) = IP(F(X),y), 
where IP is the inner product function taken over F2. 

Remark 6.10. Note that here the function F may not be efficiently computable (in time poly(n)). 
However, the time to compute F is polynomial in the length of the truth table of our final extractor. 

Again, we will show that our construction is a non-malleable extractor by using Theorem 5.2. 
To this end, we first show the lemma. 

Lemma 6.11. For any (n, k) source X with k = j^^n and any independent (n', + 1) source Y , 
\P[F{X),Y) is non-constant. 

Proof. As usual we can assume without loss of generality that X and Y are flat sources. If 
\P{F{X),Y) is a constant, then Supp(F(X)) and Supp(y) must be contained in two affine sub- 
spaces with dimension di,d2 such that di + d2 < n' . Note that d2 > ^ since Y has min-entropy 
^ -|- 1. We next show that di > |ri' and thus reach a contradiction. 



1^^{A,B) = 
To see this, let $S = \mathrm{Supp}(F(X))$. It suffices to show that $S$ is not contained in any affine
1' 



subspace of dimension in'. Let A be such an affine subspace. We have 



\AnS\<\An f-\z)\ < 2 ■ 2-"*'2§"' = 2f "'+\ 

where the last inequality follows from the fact that / is an affine extractor. Now note that l^l = 
2TT2A" = 2—" . Thus we have that |A n 5| < |5| and therefore S cannot be contained in A. □ 

Now we have the following lemma. 

Lemma 6.12. There exists a constant C, = C(A) such that for any {n, k) source X with k = j^^n 
and any independent (n', ^) source Y, \P{F{X),Y) is 2-^"-dose to uniform. 

Proof. Let v = ^ = ol = min{X, ^} and 5 = Let C' c^nd r be the constant and the integer 

guaranteed by conjecture 6.8 for a and S. Let ^ = nT-i^l^, ^^}- We will prove the lemma by way 
of contradiction. 

Let X and Y be two independent sources as in the statement of the lemma. Again we assume 
without loss of generality that both X and Y are flat sources. Let A = Supp(X) and A = 
{F{a)\a e A} <Z F^'. Let B = Supp(y) C F^'. Note that F is an injective function. Thus 

\A\ = iwx"^ = 2^"' > 2""' and \B\ = 2Tr > 2^ > 2""'. 

Assume for the sake of contradiction that the error of IP(F(X), Y), which is equal to B), 
is greater than 2"''" > 2~^ . Then by the ADC conjecture (conjecture 6.8) there exist A' A 
and B' <^ B such that 

- - 7r/ 

, \A\ 5\, 2.5A LB 2 15 n' ,-, 

\^\ > -TTTT > 2~ = and LB' > ', > — - > 2-+\ 

and f^t^{A',B') = 1. 

2 5A 

Let A" be the preimages of A' under F. Since F is injective, we must have \A"\ > 2 i+2a". Thus if 
we let X' and Y' be the uniform distribution over A" and B' respectively, we get two independent 
sources that satisfy the conditions in Lemma 6.11. However \P{F{X'),Y') is a constant, which 
contradicts Lemma 6.11. Thus we must have that \P{F{X),Y) is 2~''"'-close to uniform. □ 

Now we can prove the following theorem. 

Theorem 6.13. nmExt is a (k,e) -non-malleable extractor with k = -j^^n, seed length d = 2^4\ ^ ~ 
1 anrfe = 2-^("). 

Proof. The seed length is clearly d = n' /2 — 1 = ~ ^- ^ ~ ^('^) ™ Lemma 6.12. By 

Lemma 6.12, \P{F{X),Y) is a two-source extractor for an (n, k = j^^n) source and an independent 
(n', ^) source with error 2~^". Thus by Theorem 5.2, nmExt is a (fc, e)-non-malleable extractor 
with error e = O(2-"'/30 _^ 2"^") = 2-^("). ■ 

Similarly, we can generalize our construction to (r, k, e)-non-malleable extractors. We have the 
following construction and theorem. 
Construction 6.14. Given a constant integer $r$, any $(n, k)$ source $X$ and a constant $0 < \lambda < 1$,
let / : {0,1}"' ^ {O,!}'"' be an [n',m' = (1 - A)^n', 2""^'] affine extractor such that 
n = n' — m'. For any z S {0, 1}"*', let f~^(z) = {x : f{x) = z}. Then there exists z G {0, l}*"' such 
that \ f-^iz)\ > 2"^. Let F : {0,1}" ^ /"H-^) be (any) injective map. Now take an independent 
uniform seed Y G {0, l}"7(r+i)-i and encode Y toY such that Y = Enc,.(y) = (Y, Y^,--- , y2r+i) 
when y is viewed as an element in lF2n'/{r+i)- Define a seeded extractor 

nmExt(X,y) = IP(F(X),Enc,.(y)) = \P{F{X),Y), 
where IP is the inner product function taken over F2. 

Theorem 6.15. nmExt is a (r, k, e) -non-malleable extractor with k = j^:^^^^n, seed length d 
-^TTTTuTTn -1 ande = 2-^("). 

r+l+(r+l)^A 



By using Theorem 5.4 instead of Theorem 5.2, the proof of this theorem is very similar to the 
proof of Theorem 6.13. We thus omit the proof here. 



6.4 Increasing the Output Size and Reducing the Seed Length 

In this section we show that we can increase the output size and reduce the seed length for the 
constructions in Subsection 6.1, Subsection 6.2 and Subsection 6.3. All these constructions share 
the same pattern: the seed Y is encoded using the parity check matrix of a BCH code, and then 
the output is the inner product function of the encoded source and the encoded seed over F2. 

We only discuss the construction in Subsection 6.2, but the method can be applied to all the other constructions in the same way. We start by showing how to increase the output size to $m = \Omega(n)$.



6.4.1 Increasing the output size 

Recall that in the construction we used a field $\mathbb{F}_{2^p}$ for a prime $p$. The elements of the finite field $\mathbb{F}_{2^p}$ form a vector space of dimension $p$ over $\mathbb{F}_2$. Let $b_1, \dots, b_p \in \mathbb{F}_{2^p}$ be a basis for this vector space. Now recall that in the construction we encode the seed $Y$ to $\bar{Y} = (Y, Y^3)$, when viewing $Y$ as an element in $\mathbb{F}_{2^p}$. Now for each $b_i$, let $\bar{Y}^i = (b_i Y, b_i Y^3)$ and define one bit $Z_i = \mathrm{IP}(\mathrm{Enc}(X), \bar{Y}^i)$. We now show that $\{Z_i\}$ satisfy the conditions of a non-uniform XOR lemma.
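The way the bits $Z_i$ combine can be illustrated on a toy field. The sketch below is a hedged example, not the paper's instantiation: it uses $\mathrm{GF}(2^4)$ with the polynomial $x^4 + x + 1$, the polynomial basis $b_i = x^i$ as an assumed choice of basis, a hypothetical seed-to-field-element map that sets the top bit, and it treats `x` as an already encoded $2M$-bit source sample.

```python
M = 4            # toy stand-in for the prime p
POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^M), represented as M-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def ip(a, b):
    """Inner product over GF(2) of two bit vectors packed as integers."""
    return bin(a & b).count("1") % 2

def seed_to_elem(y):
    return y | (1 << (M - 1))     # assumed injective map to a nonzero element

def scaled_bit(x, y, t):
    """IP(x, (t*Y, t*Y^3)) for a field element t."""
    elem = seed_to_elem(y)
    cube = gf_mul(elem, gf_mul(elem, elem))
    return ip(x, (gf_mul(t, elem) << M) | gf_mul(t, cube))

def output_bits(x, y):
    """Z_i = IP(x, (b_i*Y, b_i*Y^3)) for the polynomial basis b_i = x^i."""
    return [scaled_bit(x, y, 1 << i) for i in range(M)]

bits = output_bits(0b10110101, 0b011)
```

Because field multiplication and IP are both $\mathbb{F}_2$-linear, the XOR of the $Z_i$ over a set $S$ equals `scaled_bit(x, y, t)` with $t = \sum_{i \in S} b_i$. This is the identity that reduces every non-trivial XOR of output bits to a single inner product.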

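Lemma 6.16. Given any $(n, k)$-source $X$ with $k = (1/2 - \delta)n$ and an independent seed $Y \in \{0,1\}^{p-1}$ with min-entropy $(1 - \delta)p + 2$, let $A : \{0,1\}^{p-1} \to \{0,1\}^{p-1}$ be any deterministic function such that $\forall y, A(y) \ne y$. For any $i$, let $Z_i' = \mathrm{IP}(\mathrm{Enc}(X), \bar{Y'}^i)$, where $\bar{Y'}^i = (b_i Y', b_i Y'^3)$ and $Y' = A(Y)$. Then for any non-empty subset $S_1 \subseteq [p]$ and any subset $S_2 \subseteq [p]$, we have that

$$\Big|\bigoplus_{i \in S_1} Z_i \oplus \bigoplus_{j \in S_2} Z_j' - U\Big| \le 2^{-\Omega(n)}.$$

Proof. Note that

$$\bigoplus_{i \in S_1} Z_i = \mathrm{IP}\Big(\mathrm{Enc}(X), \sum_{i \in S_1} \bar{Y}^i\Big) = \mathrm{IP}(\mathrm{Enc}(X), t_1 (Y, Y^3)),$$

where $t_1 = \sum_{i \in S_1} b_i \in \mathbb{F}_{2^p}$, and

$$\bigoplus_{j \in S_2} Z_j' = \mathrm{IP}\Big(\mathrm{Enc}(X), \sum_{j \in S_2} \bar{Y'}^j\Big) = \mathrm{IP}(\mathrm{Enc}(X), t_2 (Y', Y'^3)),$$

where $t_2 = \sum_{j \in S_2} b_j \in \mathbb{F}_{2^p}$. Thus

$$\bigoplus_{i \in S_1} Z_i \oplus \bigoplus_{j \in S_2} Z_j' = \mathrm{IP}(\mathrm{Enc}(X), \hat{Y}),$$

where $\hat{Y} = t_1 (Y, Y^3) + t_2 (Y', Y'^3)$.

We now bound the min-entropy of $\hat{Y}$ and have the following claim.

Claim 6.17. $H_\infty(\hat{Y}) \ge (1 - \delta)p$.

Proof. We have two cases.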

Case 1: $S_2 = \emptyset$. In this case $\hat{Y} = t_1 (Y, Y^3)$. Since $S_1 \ne \emptyset$, we have $t_1 \ne 0$. Thus $\hat{Y}$ has the same min-entropy as $Y$, which is $(1 - \delta)p + 2 > (1 - \delta)p$.

Case 2: $S_2 \ne \emptyset$. In this case we have $t_1 \ne 0$ and $t_2 \ne 0$. We need to bound the min-entropy of $\hat{Y} = t_1 (Y, Y^3) + t_2 (Y', Y'^3)$. Again, if for every two different $y_1, y_2$ we have $t_1 (y_1, y_1^3) + t_2 (y_1', y_1'^3) \ne t_1 (y_2, y_2^3) + t_2 (y_2', y_2'^3)$, then $\hat{Y}$ will have the same min-entropy as $Y$. We now show that any element in $\mathrm{Supp}(\hat{Y})$ can come from at most 3 different elements in $\mathrm{Supp}(Y)$.

To show this, assume for the sake of contradiction that there are 4 different $y_1, y_2, y_3, y_4$ such that $t_1 (y_i, y_i^3) + t_2 (y_i', y_i'^3)$ are the same for $i = 1, 2, 3, 4$. First consider $y_1, y_2$; we have $t_1 (y_1, y_1^3) + t_2 (y_1', y_1'^3) = t_1 (y_2, y_2^3) + t_2 (y_2', y_2'^3)$. Since $t_1 \ne 0$, let $r = t_2/t_1 \in \mathbb{F}_{2^p}$. Thus $r \ne 0$ and we have $(y_1, y_1^3) + r (y_1', y_1'^3) = (y_2, y_2^3) + r (y_2', y_2'^3)$. We first consider the case where $r = 1$. In this case, the vectors $(y_1, y_1^3)$, $(y_1', y_1'^3)$, $(y_2, y_2^3)$ and $(y_2', y_2'^3)$ are linearly dependent over $\mathbb{F}_2$. However we know that the columns of the parity check matrix of the BCH code are 4-wise linearly independent. Thus we must have $y_1' = y_2$ and $y_2' = y_1$. Thus in this case the element in $\mathrm{Supp}(\hat{Y})$ comes from at most 2 different elements in $\mathrm{Supp}(Y)$. Now if $r \ne 1$, we have

$$y_1 + r y_1' = y_2 + r y_2'$$

and

$$y_1^3 + r (y_1')^3 = y_2^3 + r (y_2')^3.$$

Hence we get

$$y_1 - y_2 = r (y_2' - y_1')$$

and

$$(y_1^2 + y_1 y_2 + y_2^2)(y_1 - y_2) = r (y_2'^2 + y_1' y_2' + y_1'^2)(y_2' - y_1').$$

Since $y_1 \ne y_2$ and $r \ne 0$, we must have that $y_1' \ne y_2'$. Thus we get

$$y_1^2 + y_1 y_2 + y_2^2 = y_2'^2 + y_1' y_2' + y_1'^2.$$

Similarly we can get

$$y_1^2 + y_1 y_3 + y_3^2 = y_3'^2 + y_1' y_3' + y_1'^2.$$

Thus

$$(y_1 + y_2 + y_3)(y_2 - y_3) = (y_1' + y_2' + y_3')(y_2' - y_3').$$

Also, from $y_2 + r y_2' = y_3 + r y_3'$ we get

$$y_2 - y_3 = r (y_3' - y_2').$$

Since $y_2 \ne y_3$, we have

$$y_1 + y_2 + y_3 = -\frac{1}{r} (y_1' + y_2' + y_3').$$

Similarly we can get

$$y_1 + y_2 + y_4 = -\frac{1}{r} (y_1' + y_2' + y_4').$$

Therefore

$$y_4 - y_3 = \frac{1}{r} (y_3' - y_4').$$

On the other hand, from $y_3 + r y_3' = y_4 + r y_4'$ we get

$$y_4 - y_3 = r (y_3' - y_4').$$

Thus

$$(r^2 - 1)(y_3' - y_4') = 0.$$

Since $r \ne 1$, $r^2 - 1 \ne 0$. Thus we have $y_3' = y_4'$, and hence $y_3 = y_4$, a contradiction.

Therefore, the min-entropy of $\hat{Y}$ is at least $H_\infty(Y) - \log 3 = (1 - \delta)p + 2 - \log 3 > (1 - \delta)p$. □

Now, by Lemma 6.3, the lemma follows. □ 
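The counting argument in Claim 6.17 can be checked by brute force on a toy field. The sketch below is only an illustration under stated assumptions: it works over $\mathbb{F}_{2^5}$ (with the irreducible polynomial $x^5 + x^2 + 1$, chosen here for convenience), samples random fixed-point-free maps $A$ on the nonzero elements, and counts how many $y$ collide under $y \mapsto (y + rA(y),\, y^3 + rA(y)^3)$ for every $r \neq 0$. The names `gf_mul`, `max_preimages` and `worst_case` are hypothetical helpers, not from the paper.

```python
import random

P = 5
MOD = 0b100101  # x^5 + x^2 + 1, irreducible over F_2 (assumed toy field)

def gf_mul(a, b):
    """Multiply two elements of F_{2^P} given as integers in [0, 2^P)."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> P) & 1:
            a ^= MOD
    return res

def cube(a):
    return gf_mul(a, gf_mul(a, a))

def max_preimages(A, r):
    """Largest number of y mapped to the same (y + r*A(y), y^3 + r*A(y)^3)."""
    counts = {}
    for y in range(1, 1 << P):
        yp = A[y]
        key = (y ^ gf_mul(r, yp), cube(y) ^ gf_mul(r, cube(yp)))
        counts[key] = counts.get(key, 0) + 1
    return max(counts.values())

def random_fixed_point_free(rng):
    """A random map on the nonzero elements with A(y) != y for all y."""
    elems = list(range(1, 1 << P))
    return {y: rng.choice([z for z in elems if z != y]) for y in elems}

def worst_case(trials=20, seed=0):
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        A = random_fixed_point_free(rng)
        for r in range(1, 1 << P):  # r = t_2 / t_1 ranges over nonzero elements
            worst = max(worst, max_preimages(A, r))
    return worst
```

Random maps need not reach the worst case, but the bound of 3 preimages should hold for every map tried, which is what costs at most $\log 3$ bits of min-entropy.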

Now we have the following theorem. 

Theorem 6.18. There exists a constant $0 < \delta < 1$ such that for any $n \in \mathbb{N}$, $k = (1/2 - \delta)n$, there exists an explicit $(k, \epsilon)$-non-malleable extractor $\mathsf{nmExt} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^m$ with $m = \Omega(n)$

Proof. By Lemma 6.16 and Lemma 3.13, we can choose $m = \Omega(n)$ bits from $\{Z_i\}$ such that when we have $\mathsf{nmExt}$ output $Z_1 \circ \cdots \circ Z_m$, we get

$$|(\mathsf{nmExt}(X,Y), \mathsf{nmExt}(X,A(Y))) - (U_m, \mathsf{nmExt}(X,A(Y)))| \leq 2^{-\Omega(n)}.$$

Note that in Lemma 6.16 the seed $Y$ only has min-entropy $(1-\delta)p + 2$. Thus if we use a uniform
seed Y £ {0, 1}p-\ by Theorem 3.17 we have that 

$$|(\mathsf{nmExt}(X,Y), \mathsf{nmExt}(X,A(Y)), Y) - (U_m, \mathsf{nmExt}(X,A(Y)), Y)| \leq \epsilon,$$
where e = 2^"^{2-^p+'^ + 2-'^(")) = 2-^(") when m = and is smah enough. ■ 
6.4.2 Reducing the seed length

In the constructions mentioned above, we use a BCH code with distance 5. Thus the columns of the parity check matrix are 4-wise linearly independent. To reduce the seed length, we are going to use a BCH code with larger distance. Specifically, we will choose a $[2^\ell - 1, 2^\ell - 1 - 2t\ell, 4t+1]$-BCH code with $\ell = p/t$ for some parameter $t$ to be chosen later. Note that the parity check matrix is a $2p \times (2^\ell - 1)$ matrix (see the footnote below). Thus the columns of the matrix are $4t$-wise linearly independent. The detailed construction is as follows.

• Given an (n, /c)-source X with k = (1/2 — 5)n, pick a prime p such that n < p < n(l + ^ ^ ^ ) . 

• Let $q = 2^p$ and $g$ be a generator in $\mathbb{F}_q^*$. Treat $X$ as an element in $\mathbb{F}_q^*$ and encode $X$ such that $\mathsf{Enc}(X) = (X, g^X)$.

• Let $\ell = p/t$. Take the parity check matrix of a $[2^\ell - 1, 2^\ell - 1 - 2t\ell, 4t+1]$-BCH code. Note that it is a $2p \times (2^\ell - 1)$ matrix. Take an independent and uniform seed $Y \in \{0,1\}^\ell$, and let $S_Y$ stand for the integer whose binary expression is $Y$. We encode $Y$ to $\bar{Y}$ such that $\bar{Y}$ is the $S_Y$'th column in the parity check matrix.

• Output $\mathsf{nmExt}(X, Y) = \mathsf{IP}(\mathsf{Enc}(X), \bar{Y})$ where $\mathsf{IP}$ is the inner product function taken over $\mathbb{F}_2$.
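The seed encoding in the steps above can be sketched on toy parameters. This is a minimal illustration, assuming the usual narrow-sense binary BCH parity check layout in which the column indexed by a nonzero field element $y$ is $(y, y^3, \ldots, y^{4t-1})$; the paper does not spell out the row layout here, so that choice, the toy field $\mathbb{F}_{2^4}$ with polynomial $x^4 + x + 1$, and all function names are assumptions.

```python
from itertools import combinations

ELL = 4
MOD = 0b10011  # x^4 + x + 1, irreducible over F_2 (assumed toy field)

def gf_mul(a, b):
    """Multiply two elements of F_{2^ELL} given as integers."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> ELL) & 1:
            a ^= MOD
    return res

def gf_pow(a, e):
    out = 1
    for _ in range(e):
        out = gf_mul(out, a)
    return out

def encode_seed(y, t):
    """Column for nonzero y: (y, y^3, ..., y^(4t-1)) packed into one integer of 2*t*ELL bits."""
    col = 0
    for e in range(1, 4 * t, 2):
        col = (col << ELL) | gf_pow(y, e)
    return col

def inner_product(u, v):
    """Inner product over F_2 of two bit vectors stored as integers."""
    return bin(u & v).count("1") & 1

def nm_ext_bit(enc_x, y, t):
    """One output bit IP(Enc(X), Y-bar); enc_x is a 2*t*ELL-bit integer standing in for Enc(X)."""
    return inner_product(enc_x, encode_seed(y, t))

def is_d_wise_independent(t, d):
    """Check that no non-empty set of at most d distinct columns sums to zero over F_2."""
    cols = [encode_seed(y, t) for y in range(1, 1 << ELL)]
    for size in range(1, d + 1):
        for subset in combinations(cols, size):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == 0:
                return False
    return True
```

For $t = 1$ this is the distance-5 code of the earlier construction, and `is_d_wise_independent(1, 4)` exercises the 4-wise independence used in Claim 6.17.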

As in Subsection 6.2, we have Claim 6.4. We now want to argue about the min-entropy of $t\bar{Y}$ and $t(\bar{Y} + \overline{A(Y)})$.

Lemma 6.19. Assume Y has min-entropy k2, then tY is t^2~^^'^^^^ -close to having min-entropy 
t{k2 - logt), and t{Y + Y') is t22-(fc2+2) +t{t2-p^y°^* -close to having min-entropy t{{l - ^)fc2 - 
31ogt). 

Proof. Without loss of generality assume that $Y$ is a flat source. Let $K = 2^{k_2}$. First consider $t\bar{Y}$. Note that $\bar{Y}$ has the same min-entropy as $Y$ and is also a flat source, since every two columns of the parity check matrix are different. An element in the support of $t\bar{Y}$ has the form $\bar{y}_1 + \cdots + \bar{y}_t$. Consider the case where all the $\bar{y}_i$'s are different. This takes up a probability mass of

^^■A-'=i.(i--)...(i-— )>i-|:->i--. 

Since the columns of the parity check matrix are $4t$-wise linearly independent, the sums of any two different sets $\{\bar{y}_i\}$ cannot be the same. Therefore, the probability mass of getting a
particular value is at most t\K~^ < 2~*('^2-iogt)_ fj^j^us tY is t'^2~^^'^~^^^ -close to having min-entropy 
t{k2-logt). 

Next consider $t(\bar{Y} + \overline{A(Y)})$. Let $A(Y) = Y'$ and $Y'' = \bar{Y} + \bar{Y}'$. Note that for every $s \in \mathrm{Supp}(Y'')$, $s \neq 0$ since $\forall y, A(y) \neq y$. Also note that $Y''$ has min-entropy at least $k_2 - 1$, since if $\bar{y}_1 + \bar{y}'_1 = \bar{y}_2 + \bar{y}'_2$ for $y_1 \neq y_2$, then we must have $y'_1 = y_2$ and $y'_2 = y_1$. Without loss of generality assume that $Y''$ is a flat source with min-entropy $k_2 - 1$. Let $K_2 = 2^{k_2 - 1}$. Note that now in the support of $Y''$ there are no two different $y_1, y_2$ such that $\bar{y}_1 + \bar{y}'_1 = \bar{y}_2 + \bar{y}'_2$ (since this will be absorbed into the same element).

Footnote: Actually $p$ is not divisible by $t$, thus $\ell t < p$. However, for simplicity we will assume that the matrix has $2p$ rows. For example, we can add 0's at the end; the small error does not affect our analysis.
We now consider $tY''$. An element in its support has the form $\sum_{i=1}^{t} (\bar{y}_i + \bar{y}'_i)$. We first get rid of those elements in $\mathrm{Supp}(tY'')$ such that some of the $\{\bar{y}_i + \bar{y}'_i\}$'s are the same. By the same argument as above, this takes up only a small probability mass (on the order of $t^2/K_2$). Now, for a particular set $\{\bar{y}_i + \bar{y}'_i\}_{i \in [t]}$, we consider how many different sets can have the same sum.

Since the columns of the parity check matrix are $4t$-wise linearly independent, if the sums of two different sets $\{\bar{y}_i + \bar{y}'_i\}_{i \in [t]}$ are the same, then except for those $\bar{y}_i + \bar{y}'_i$'s that are common to both sets, the rest of the $\bar{y}_i + \bar{y}'_i$'s must form cycles. By a cycle we mean a set of $l$ elements such that $y'_1 = y_2, y'_2 = y_3, \ldots, y'_l = y_1$, so that the sum is 0. Note that $l \geq 3$ since the support of $Y''$ has no 2-cycles. Let $S_1, S_2$ be the two sets $\{\bar{y}_i + \bar{y}'_i\}_{i \in [t]}$. Now, the elements in a cycle can come from both sets or just from one set. If the elements of a cycle come only from $S_2$, then this cycle can be replaced by any other cycle with the same length, and the sums of $S_1$ and $S_2$ are still the same. On the other hand, if the elements of a cycle come from both $S_1$ and $S_2$, then the elements in this cycle are completely determined by $S_1$, since cycles are disjoint. Therefore, let $r$ be the number of common elements in $S_1, S_2$, and let $l$ be the total length of the cycles whose elements come only from the remaining elements of $S_2$. Noting that cycles have length at least 3, we have that if $l > \log t$, then the total probability mass of these elements in $\mathrm{Supp}(tY'')$ is at most

logt<l<t ^ ^ ^ 3 ^ \ogt<l<t logt<l<t 

On the other hand, if / < logt, then the probability that tY" gets a particular value is at most 

E E C) C ; (fy^^r < tlogt- t-(f )^(K2)-' < tiogtit\K2)-^^-'-^^Y. 

0<l<\ogtO<r<t-l \ ^ / 

Thus the min-entropy is at least — ^^)fc2 — 31ogt). □ 

Now for an $(n, k)$-source $X$ with $k = (1/2 - \delta)n$, we know that $3\mathsf{Enc}(X)$ is $2^{-\Omega(n)}$-close to having min-entropy $(1/2 + \delta)(2p)$. Assume that we want our non-malleable extractor to have error $\epsilon < 1/n$. We'll choose a parameter $t < n/(C \log n)$ for a sufficiently large constant $C > 1$. When $Y$ is uniform over $\ell = p/t$ bits, $t\bar{Y}$ is close to having min-entropy $t(k_2 - \log t) > (1 - 1/C)p > (1/2 - \delta/2)(2p)$, and $t(\bar{Y} + \bar{Y}')$
is close to having min-entropyt((l-i^)/c2-31ogi) > (1-1/C)p > {1/2-6 /2){2p). When 
$t\bar{Y}$ and $t(\bar{Y} + \bar{Y}')$ indeed have this min-entropy, by Lemma 3.19 we have that both $\mathsf{IP}(3\mathsf{Enc}(X), t\bar{Y})$ and $\mathsf{IP}(3\mathsf{Enc}(X), t(\bar{Y} + \bar{Y}'))$ are $2^{-\Omega(n)}$-close to uniform. Thus we can take $t = \Omega(n/\log(1/\epsilon))$ and by Lemma 3.20 and Theorem 3.17 we have that the error of the non-malleable extractor is at most $\epsilon$, and the seed length is roughly $p/t = O(n/t) = O(\log(1/\epsilon))$. Thus we have the following theorem.

Theorem 6.20. There exists a universal constant $\delta > 0$ such that for every $n \in \mathbb{N}$ and $\epsilon$ such that $2^{-\Omega(n)} < \epsilon < 1/\mathrm{poly}(n)$, there exists an explicit $(k, \epsilon)$ non-malleable extractor $\mathsf{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}$ for $k = (1/2 - \delta)n$ and seed length $d = O(\log n + \log(1/\epsilon))$.
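The parameter trade-off behind Theorem 6.20 can be illustrated numerically. The sketch below is only a rough calculation under simplifying assumptions that are not in the paper: it takes $p = n$, picks $t$ as the largest integer with $t \leq n/(C \cdot \max(\log n, \log(1/\epsilon)))$ for an illustrative constant $C$, and reports the resulting seed length $\ell = \lceil p/t \rceil$. The function name `seed_parameters` and the default `C = 8` are hypothetical.

```python
import math

def seed_parameters(n, eps, C=8):
    """Return (t, ell): the number of summed copies t and the seed length ell = ceil(p / t).

    Assumes p = n for simplicity; C is an illustrative constant, not the paper's.
    """
    p = n
    budget = C * max(math.log2(n), math.log2(1.0 / eps))
    t = max(1, int(p // budget))
    ell = math.ceil(p / t)
    return t, ell
```

For example, with $n = 2^{20}$ and $\epsilon = 2^{-40}$ the seed length comes out as a small multiple of $\log(1/\epsilon)$, in line with $d = O(\log n + \log(1/\epsilon))$.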






7 An Optimal Privacy Amplification Protocol for Arbitrarily Linear Entropy

In this section we present our privacy amplification protocol for $(n, k)$-sources $X$ with $k = \delta n$ for any constant $\delta > 0$. Following [KR09] and [DLWZ11], we define a privacy amplification protocol $(P_A, P_B)$. The protocol is executed by two parties, Alice and Bob, who share a secret $X \in \{0,1\}^n$. An active, computationally unbounded adversary Eve might have some partial information $E$ about $X$ satisfying $H_\infty(X|E) \geq k$. Since Eve is unbounded, we can assume without loss of generality that she is deterministic. Informally, we want the protocol to be such that whenever a party (Alice or Bob) does not reject, the key $R$ output by this party is random and independent of Eve's view. Moreover, if both parties do not reject, they must output the same keys $R_A = R_B$ with overwhelming probability.

More formally, we assume that Eve has full control of the communication channel between the two parties. This means that Eve can arbitrarily insert, delete, reorder or modify messages sent by Alice and Bob to each other. In particular, Eve's strategy $P_E$ defines two correlated executions $(P_A, P_E)$ and $(P_E, P_B)$ between Alice and Eve, and Eve and Bob, called the "left execution" and the "right execution", respectively. Alice and Bob are assumed to have fresh, private and independent random bits $Y$ and $W$, respectively. $Y$ and $W$ are not known to Eve. In the protocol we use $\perp$ as a special symbol to indicate rejection. At the end of the left execution $(P_A(X,Y), P_E(E))$, Alice outputs a key $R_A \in \{0,1\}^m \cup \{\perp\}$. Similarly, Bob outputs a key $R_B \in \{0,1\}^m \cup \{\perp\}$ at the end of the right execution $(P_E(E), P_B(X,W))$. We let $E'$ denote the final view of Eve, which includes $E$ and the communication transcripts of both executions $(P_A(X,Y), P_E(E))$ and $(P_E(E), P_B(X,W))$. We can now define the security of $(P_A, P_B)$.

Definition 7.1. An interactive protocol $(P_A, P_B)$, executed by Alice and Bob on a communication channel fully controlled by an active adversary Eve, is a $(k, m, \epsilon)$-privacy amplification protocol if it satisfies the following properties whenever $H_\infty(X|E) \geq k$:

1. Correctness. If Eve is passive, then $\Pr[R_A = R_B \wedge R_A \neq \perp \wedge R_B \neq \perp] = 1$.

2. Robustness. We start by defining the notion of pre-application robustness, which states that even if Eve is active, $\Pr[R_A \neq R_B \wedge R_A \neq \perp \wedge R_B \neq \perp] \leq \epsilon$.

The stronger notion of post-application robustness is defined similarly, except Eve is additionally given the key $R_A$ the moment she completes the left execution $(P_A, P_E)$, and the key $R_B$ the moment she completes the right execution $(P_E, P_B)$. For example, if Eve completes the left execution before the right execution, she may try to use $R_A$ to force Bob to output a different key $R_B \notin \{R_A, \perp\}$, and vice versa.

3. Extraction. Given a string $r \in \{0,1\}^m \cup \{\perp\}$, let $\mathsf{purify}(r)$ be $\perp$ if $r = \perp$, and otherwise replace $r \neq \perp$ by a fresh $m$-bit random string $U_m$: $\mathsf{purify}(r) \leftarrow U_m$. Letting $E'$ denote Eve's view of the protocol, we require that

$$\Delta((R_A, E'), (\mathsf{purify}(R_A), E')) \leq \epsilon \quad \text{and} \quad \Delta((R_B, E'), (\mathsf{purify}(R_B), E')) \leq \epsilon.$$

Namely, whenever a party does not reject, its key looks like a fresh random string to Eve.

The quantity $k - m$ is called the entropy loss and the quantity $\log(1/\epsilon)$ is called the security parameter of the protocol.
7.1 Prerequisites from previous work

One-time message authentication codes (MACs) use a shared random key to authenticate a message 
in the information-theoretic setting. 

Definition 7.2. A function family $\{\mathsf{MAC}_R : \{0,1\}^d \to \{0,1\}^v\}$ is an $\epsilon$-secure one-time MAC for messages of length $d$ with tags of length $v$ if for any $w \in \{0,1\}^d$ and any function (adversary) $A : \{0,1\}^v \to \{0,1\}^d \times \{0,1\}^v$,

$$\Pr_R[\mathsf{MAC}_R(W') = T' \wedge W' \neq w \mid (W', T') = A(\mathsf{MAC}_R(w))] \leq \epsilon,$$

where $R$ is the uniform distribution over the key space $\{0,1\}^\ell$.

Theorem 7.3 ([KR09]). For any message length $d$ and tag length $v$, there exists an efficient family of $(\lceil d/v \rceil 2^{-v})$-secure MACs with key length $\ell = 2v$. In particular, this MAC is $\epsilon$-secure when $v = \log d + \log(1/\epsilon)$.

More generally, this MAC also enjoys the following security guarantee, even if Eve has partial information $E$ about its key $R$. Let $(R, E)$ be any joint distribution. Then, for all attackers $A_1$ and $A_2$,

$$\Pr_{(R,E)}[\mathsf{MAC}_R(W') = T' \wedge W' \neq W \mid W = A_1(E),\ (W', T') = A_2(\mathsf{MAC}_R(W), E)] \leq \left\lceil \frac{d}{v} \right\rceil 2^{v - H_\infty(R|E)}.$$

(In the special case when $R = U_{2v}$ and independent of $E$, we get the original bound.)

Remark 7.4. Note that the above theorem indicates that the MAC works even if the key R has 
average conditional min-entropy rate > 1/2. 
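To make the $\lceil d/v \rceil 2^{-v}$ bound concrete, here is a small sketch of a standard polynomial-evaluation one-time MAC with key length $2v$. It is an assumption that this matches the [KR09] construction in detail; the toy tag length $v = 8$, the field polynomial, and the names `mac` and `best_forgery_probability` are all illustrative. The key is a pair $(a, b)$ of field elements and the tag of blocks $m_1, \ldots, m_k$ is $b + \sum_i a^i m_i$.

```python
V = 8
MOD = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over F_2 (assumed toy field)

def gf_mul(a, b):
    """Multiply two elements of F_{2^V} given as integers."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> V) & 1:
            a ^= MOD
    return res

def mac(key, blocks):
    """Tag = b + sum_i a^i * m_i over F_{2^V}; key = (a, b), blocks = list of V-bit ints."""
    a, b = key
    tag, power = b, 1
    for m in blocks:
        power = gf_mul(power, a)
        tag ^= gf_mul(power, m)
    return tag

def best_forgery_probability(w, w_prime):
    """Best success probability of a forger who saw the tag of w and submits w_prime
    (same number of blocks) with the tag shifted by a fixed delta.

    The key part b cancels in the difference of the two tags, so it suffices to count over a.
    """
    counts = [0] * (1 << V)
    for a in range(1 << V):
        delta = mac((a, 0), w) ^ mac((a, 0), w_prime)
        counts[delta] += 1
    return max(counts) / (1 << V)
```

With two blocks the difference of tags is a nonzero polynomial of degree at most 2 in $a$, so no shift succeeds for more than 2 of the 256 values of $a$, matching the $\lceil d/v \rceil 2^{-v}$ form.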

Sometimes it is convenient to talk about average case seeded extractors, where the source $X$ has average conditional min-entropy $\widetilde{H}_\infty(X|Z) \geq k$ and the output of the extractor should be uniform given $Z$ as well. The following lemma is proved in [DORS08].

Lemma 7.5. [DORS08] For any $\delta > 0$, if $\mathsf{Ext}$ is a $(k, \epsilon)$ extractor then it is also a $(k + \log(1/\delta), \epsilon + \delta)$ average case extractor.

For a strong seeded extractor with optimal parameters, we use the following extractor con- 
structed in [GUV09]. 

Theorem 7.6 ([GUV09]). For every constant $\alpha > 0$, and all positive integers $n, k$ and any $\epsilon > 0$, there is an explicit construction of a strong $(k, \epsilon)$-extractor $\mathsf{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with $d = O(\log n + \log(1/\epsilon))$ and $m \geq (1 - \alpha)k$. It is also a strong $(k, \epsilon)$ average case extractor with $m \geq (1 - \alpha)k - O(\log n + \log(1/\epsilon))$.

We need the following construction of strong two-source extractors in [Raz05]. 

Theorem 7.7 ([Raz05]). For any $n_1, n_2, k_1, k_2, m$ and any $0 < \delta < 1/2$ with

• $n_1 \geq 6 \log n_1 + 2 \log n_2$

• $k_1 \geq (0.5 + \delta)n_1 + 3 \log n_1 + \log n_2$

• $k_2 \geq 5 \log(n_1 - k_1)$

• $m \leq \delta \min[n_1/8, k_2/40] - 1$

There is a polynomial time computable strong 2-source extractor $\mathsf{Raz} : \{0,1\}^{n_1} \times \{0,1\}^{n_2} \to \{0,1\}^m$ for min-entropy $k_1, k_2$ with error $2^{-1.5m}$.

Theorem 7.8. [DLWZ11, CRS12, Li12] For every constant $\delta > 0$, there exists a constant $\beta > 0$ such that for every $n, k \in \mathbb{N}$ with $k \geq (1/2 + \delta)n$ and $\epsilon > 2^{-\beta n}$ there exists an explicit $(k, \epsilon)$ non-malleable extractor with seed length $d = O(\log n + \log \epsilon^{-1})$ and output length $m = \Omega(n)$.

The following standard lemma about conditional min-entropy is implicit in [NZ96] and explicit 
in [MW97]. 

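Lemma 7.9 ([MW97]). Let $X$ and $Y$ be random variables and let $\mathcal{Y}$ denote the range of $Y$. Then for all $\epsilon > 0$, one has

$$\Pr_Y\left[ H_\infty(X|Y = y) \geq H_\infty(X) - \log|\mathcal{Y}| - \log\left(\frac{1}{\epsilon}\right) \right] \geq 1 - \epsilon.$$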



7.2 The privacy amplification protocol 

We first define the following alternating extraction protocol. 

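Quentin: $Q, S_0$                                Wendy: $X$, $\bar{X} = (X_1, \ldots, X_t)$

                         $S_0 \longrightarrow$
                                                 $R_0 = \mathsf{Raz}(S_0, X)$
                         $\longleftarrow R_0$
$S_1 = \mathsf{Ext}_q(Q, R_0)$
                         $S_1 \longrightarrow$
                                                 $R_1 = \mathsf{Ext}_w(X, S_1)$, $V_1 = \mathsf{Ext}_v(X_1, S_1)$
                         $\longleftarrow R_1$
$S_2 = \mathsf{Ext}_q(Q, R_1)$
                         $S_2 \longrightarrow$
                                                 $R_2 = \mathsf{Ext}_w(X, S_2)$, $V_2 = \mathsf{Ext}_v(X_2, S_2)$
                         $\cdots$
$S_t = \mathsf{Ext}_q(Q, R_{t-1})$
                         $S_t \longrightarrow$
                                                 $R_t = \mathsf{Ext}_w(X, S_t)$, $V_t = \mathsf{Ext}_v(X_t, S_t)$

Figure 1: Alternating Extraction.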



Alternating Extraction. Assume that we have two parties, Quentin and Wendy. Quentin has a source $Q$, Wendy has a source $X$ and a source $\bar{X} = (X_1 \circ \cdots \circ X_t)$ with $t$ rows. Also assume that Quentin has a weak source $S_0$ with entropy rate $> 1/2$ (which may be correlated with $Q$). Suppose that $(Q, S_0)$ is kept secret from Wendy and $(X, \bar{X})$ is kept secret from Quentin. Let $\mathsf{Ext}_q$, $\mathsf{Ext}_w$, $\mathsf{Ext}_v$ be strong seeded extractors with optimal parameters, such as that in Theorem 7.6. Let $\mathsf{Raz}$ be the strong two-source extractor in Theorem 7.7. Let $s, d$ be two integer parameters for the protocol. The alternating extraction protocol is an interactive process between Quentin and Wendy that runs in $t + 1$ steps.

In the 0'th step, Quentin sends $S_0$ to Wendy, Wendy computes $R_0 = \mathsf{Raz}(S_0, X)$ and replies $R_0$ to Quentin, and Quentin then computes $S_1 = \mathsf{Ext}_q(Q, R_0)$. In this step $R_0, S_1$ each output $d$ bits. In the first step, Quentin sends $S_1$ to Wendy, and Wendy computes $V_1 = \mathsf{Ext}_v(X_1, S_1)$ and $R_1 = \mathsf{Ext}_w(X, S_1)$. She sends $R_1$ to Quentin and Quentin computes $S_2 = \mathsf{Ext}_q(Q, R_1)$. In this step $V_1$ outputs $2^{t-1}s$ bits, and $R_1, S_2$ each output $d$ bits. In each subsequent step $i$, Quentin sends $S_i$ to Wendy, and Wendy computes $V_i = \mathsf{Ext}_v(X_i, S_i)$ and $R_i = \mathsf{Ext}_w(X, S_i)$. She replies $R_i$ to Quentin and Quentin computes $S_{i+1} = \mathsf{Ext}_q(Q, R_i)$. In step $i$, $V_i$ outputs $2^{t-i}s$ bits, and $R_i, S_{i+1}$ each output $d$ bits. Therefore, this process produces the following sequence:



$$S_0,\ R_0 = \mathsf{Raz}(S_0, X),\ S_1 = \mathsf{Ext}_q(Q, R_0),\ V_1 = \mathsf{Ext}_v(X_1, S_1),\ R_1 = \mathsf{Ext}_w(X, S_1),\ \ldots,$$
$$S_t = \mathsf{Ext}_q(Q, R_{t-1}),\ V_t = \mathsf{Ext}_v(X_t, S_t),\ R_t = \mathsf{Ext}_w(X, S_t).$$

Look-Ahead Extractor. Now we can define our look-ahead extractor. Let $Y = (Q, S_0)$ be a seed; the look-ahead extractor is defined as

$$\mathsf{laExt}((X, \bar{X}), Y) = \mathsf{laExt}((X, \bar{X}), (Q, S_0)) = V_1, \ldots, V_t.$$

Note that the look-ahead extractor can be computed by each party (Alice or Bob) alone in our 
final protocol. Now we give our protocol for privacy amplification. 
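The control flow of alternating extraction, and the geometrically shrinking output lengths $|V_i| = 2^{t-i}s$, can be sketched as follows. This is only a structural illustration: the function `toy_ext` is a hash-based stand-in (SHAKE-256) used in place of $\mathsf{Raz}$, $\mathsf{Ext}_q$, $\mathsf{Ext}_w$ and $\mathsf{Ext}_v$, and it is not an information-theoretic extractor. All names and the byte-level encoding are assumptions.

```python
import hashlib

def toy_ext(source: bytes, seed: bytes, out_bits: int, label: bytes) -> bytes:
    """Placeholder for a seeded (or two-source) extractor; returns ceil(out_bits / 8) bytes.

    NOT a real extractor: it only mimics the interface so the data flow can be followed.
    """
    n_bytes = (out_bits + 7) // 8
    return hashlib.shake_256(label + b"|" + seed + b"|" + source).digest(n_bytes)

def look_ahead_extract(x: bytes, x_rows, q: bytes, s0: bytes, s: int, d: int):
    """Return [V_1, ..., V_t] following the alternating extraction sequence."""
    t = len(x_rows)
    r = toy_ext(x, s0, d, b"raz")          # step 0: R_0 = Raz(S_0, X)
    outputs = []
    for i in range(1, t + 1):
        s_i = toy_ext(q, r, d, b"q")       # Quentin: S_i = Ext_q(Q, R_{i-1})
        v_bits = (2 ** (t - i)) * s        # V_i has 2^(t-i) * s bits
        outputs.append(toy_ext(x_rows[i - 1], s_i, v_bits, b"v"))  # V_i = Ext_v(X_i, S_i)
        r = toy_ext(x, s_i, d, b"w")       # Wendy: R_i = Ext_w(X, S_i)
    return outputs
```

Since every value is a deterministic function of $((X, \bar{X}), (Q, S_0))$, a single party holding both inputs can run the whole loop locally, which is how the look-ahead extractor is used in the protocol.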



7.2.1 The protocol 

Now we give our privacy amplification protocol for the setting when $H_\infty(X|E) = k \geq \delta n$. We assume that the error $\epsilon$ we seek satisfies $2^{-\Omega(\delta n)} < \epsilon < 1/n$. In the description below, it will be convenient to introduce an "auxiliary" security parameter $s$. Eventually, we will set $s = \log(C/\epsilon) + O(1) = \log(1/\epsilon) + O(1)$, so that $O(C)/2^s < \epsilon$, for a sufficiently large $O(C)$ constant related to the number of "bad" events we will need to account for. We will need the following building blocks:

• Let $\mathsf{Cond} : \{0,1\}^n \to (\{0,1\}^{n'})^C$ be a rate-$(\delta \to 0.9, 2^{-s})$-somewhere-condenser. Specifically, we will use the one from Theorem 3.8, where $C = \mathrm{poly}(1/\delta) = O(1)$, $n' = \mathrm{poly}(\delta)n = \Omega(n)$ and $2^{-s} > 2^{-\Omega(\delta n)}$.

• Let nmExt : {0,1}"' x {0,1}"^' ^ {0,1}™' be a (0.8n',2-^) -non-malleable extractor. Specifi- 
cally, we will use the one from Theorem 7.8 and set the output length m' = 6 • 2^s. 

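• Let $\mathsf{Ext}$, $\mathsf{Ext}_q$, $\mathsf{Ext}_w$, $\mathsf{Ext}_v$ be seeded extractors with error $2^{-s}$, seed length $d = O(\log n + s)$ and optimal entropy loss $O(s)$ as in Theorem 7.6. $\mathsf{Ext}_q$, $\mathsf{Ext}_w$, $\mathsf{Ext}_v$ will be used in $\mathsf{laExt}$.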

• Let Raz be the strong two-source extractor in Theorem 7.7. This will be used in laExt. 

• Let $\mathsf{lrMAC}$ be a one-time ("leakage-resilient") MAC for $d$-bit messages, with key length $2^C(6s)$ and tag length $2^C(3s)$. We will later use the second part of Theorem 7.3 to argue good security of this MAC even when some bits of partial information about its key are leaked to the attacker.
Using the above building blocks, the protocol is given in Figure 2. To emphasize the presence of Eve, we will use 'prime' to denote all the protocol values seen or generated by Bob; e.g., Bob picks $W'$, but Alice sees a potentially different $W$, etc.



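Alice: $X$                    Eve: $E$                    Bob: $X$

Alice:
  $(X_1, \ldots, X_C) = \mathsf{Cond}(X)$.
  Sample random $Y = (Y_1, Y_2, Y_3)$ such that
  $|Y_1| = \max\{d, d'\}$, $|Y_3| = 30\max\{d, d'\} + 3s$,
  $|Y_2| = 4Cd + 31\max\{d, d'\} + 4s$.

                    $(Y_1, Y_2, Y_3) \longrightarrow (Y'_1, Y'_2, Y'_3)$

Bob:
  $(X_1, \ldots, X_C) = \mathsf{Cond}(X)$.
  Sample random $W'$ with $d$ bits.
  $Z' = \mathsf{Ext}(X; Y'_1)$ with $2^C(6s)$ bits.
  $\bar{X}' = (\bar{X}'_1, \ldots, \bar{X}'_C)$, where $\bar{X}'_i = \mathsf{nmExt}(X_i, Y'_1)$.
  $V' = (V'_1, \ldots, V'_C) = \mathsf{laExt}((X, \bar{X}'), (Y'_2, Y'_3))$ with parameters $(2s, d)$.
  $T' = \mathsf{lrMAC}_{Z'}(W')$.
  Set final $R_B = \mathsf{Ext}(X; W')$.

                    $(W, T, V) \longleftarrow (W', T', V')$

Alice:
  $Z = \mathsf{Ext}(X; Y_1)$ with $2^C(6s)$ bits.
  $\bar{X} = (\bar{X}_1, \ldots, \bar{X}_C)$, where $\bar{X}_i = \mathsf{nmExt}(X_i, Y_1)$.
  Compute $(V_1, \ldots, V_C) = \mathsf{laExt}((X, \bar{X}), (Y_2, Y_3))$ with parameters $(2s, d)$.
  If $T \neq \mathsf{lrMAC}_Z(W)$ or the received $V$ differs from the locally computed $(V_1, \ldots, V_C)$, reject.
  Set final $R_A = \mathsf{Ext}(X; W)$.

Figure 2: 2-round Privacy Amplification Protocol for $H_\infty(X|E) \geq \delta n$.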



Theorem 7.10. For any constant $\delta > 0$, the above protocol is a privacy amplification protocol with security parameter $\log(1/\epsilon)$, entropy loss $2^{\mathrm{poly}(1/\delta)} \log(1/\epsilon)$, randomness complexity $\mathrm{poly}(1/\delta) \log(1/\epsilon)$ and communication complexity $2^{\mathrm{poly}(1/\delta)} \log(1/\epsilon)$.

Proof. The proof can be divided into two cases: whether the adversary changes $Y_1$ or not. Note that $Y_1, Y_2, Y_3$ and $W$ all have size $O(s)$.

Case 1: The adversary does not change $Y_1$. In this case, note that $Z = Z'$ and is $2^{-s}$-close to uniform in Eve's view (even conditioned on $Y_1, Y_2, Y_3$). Note that the size of $(V'_1, \ldots, V'_C)$ is at most $\sum_{i=1}^{C} 2^{C-i}(2s) \leq 2^C(2s)$, and the size of $Z$ is $2^C(6s)$. Therefore, by Lemma 3.11, even conditioned on $(V'_1, \ldots, V'_C)$, the average conditional min-entropy of $Z$ is at least $2^C(6s) - 2^C(2s) = 2^C(4s)$. Therefore by Theorem 7.3 the probability that Eve can change $W'$ to a different $W$ without causing Alice to reject is at most

$$\left\lceil \frac{d}{2^C(3s)} \right\rceil 2^{2^C(3s) - 2^C(4s)} + 2^{-s} \leq O(2^{-s}).$$

When $W = W'$, by Theorem 7.6 $R_A = R_B$ and is $2^{-s}$-close to uniform in Eve's view.

Case 2: The adversary does change $Y_1$. In this case, first note that by Theorem 3.8, $\mathsf{Cond}(X) = (X_1, \ldots, X_C)$ is $2^{-s}$-close to a somewhere rate-0.9-source with $C$ rows, and each row has length $\Omega(n)$. In the following we will simply treat it as a somewhere rate-0.9-source, since this only adds $2^{-s}$ to the error. We assume that $X_g$, $1 \leq g \leq C$, is a rate-0.9-source (see the footnote on elementary sources).

Now since the adversary changes $Y_1$ to $Y'_1 \neq Y_1$, by Theorem 7.8 we have that

$$(\bar{X}_g, \bar{X}'_g, Y_1) \approx_{2^{-s}} (U_{m'}, \bar{X}'_g, Y_1).$$

As the first step for the following analysis, we now fix $Y_1, Y'_1$ and $Z', \bar{X}'_g$. Note that $Y'_1$ is a deterministic function of $(Y_1, Y_2, Y_3)$, and after fixing $Y'_1$, $(Z', \bar{X}'_g)$ is a deterministic function of $X$. Thus by Lemma 3.11 we have the following claim.

Claim 7.11. After the fixings of $(Y_1, Y'_1, Z', \bar{X}'_g)$, $\bar{X}_g$ is a deterministic function of $X$ and is close to a source with average conditional min-entropy $m' - 2^C(6s)$.

Note that by Lemma 3.11, after this fixing, the average conditional min-entropy of $X$ is at least $k - m' - 2^C(6s)$. Now we analyze the sequences $(V_1, \ldots, V_C)$ produced by $\mathsf{laExt}$, or equivalently, the alternating extraction process. Note that here $(Q, S_0) = (Y_2, Y_3)$ and $(Q', S'_0) = (Y'_2, Y'_3)$. First we have the following claim.

Lemma 7.12. In step 0, we have

$$(R_0, S_0, S'_0) \approx_{2^{-s}} (U_d, S_0, S'_0)$$

and

$$(S_1, R_0, S_0, R'_0, S'_0) \approx_{5 \cdot 2^{-s}} (U_d, R_0, S_0, R'_0, S'_0).$$

Moreover, conditioned on $(S_0, S'_0)$, $(R_0, R'_0)$ are both deterministic functions of $X$; conditioned on $(R_0, S_0, R'_0, S'_0)$, $(S_1, S'_1)$ are both deterministic functions of $Q$.

Proof of the claim. Note that previously we have fixed $Y_1, Y'_1$. Since $Y_1$ is independent of $Y_3$ and $Y'_1$ is a deterministic function of $Y$, by Lemma 7.9 we have that $S_0 = Y_3$ is $2^{-s}$-close to a source with min-entropy $29\max\{d, d'\} + 2s$. Note that $Y_3$ and $X$ are still independent. Thus by Theorem 7.7 we have that

$$(R_0, S_0) \approx_{2^{-s}} (U_d, S_0).$$

Since conditioned on $S_0$, $R_0$ is a deterministic function of $X$, which is independent of $Y$, we also have that

$$(R_0, S_0, S'_0) \approx_{2^{-s}} (U_d, S_0, S'_0).$$

Now we fix $(S_0, S'_0)$, and $(R_0, R'_0)$ are both deterministic functions of $X$. Note that $S_0 = Y_3$ is independent of $Y_2$ and $S'_0 = Y'_3$ is a deterministic function of $Y$. Thus by Lemma 7.9 we have that conditioned on these fixings $Q = Y_2$ is $2^{-s}$-close to a source with entropy $4Cd$. Since $R_0, R'_0$ are both deterministic functions of $X$, they are independent of $Q$. Therefore by Theorem 7.6 we have

Footnote: In general a somewhere rate-0.9-source is a convex combination of elementary somewhere rate-0.9-sources, but without loss of generality we can assume it is an elementary somewhere rate-0.9-source.
$$(S_1, R_0, R'_0) \approx_{2^{-s}} (U_d, R_0, R'_0).$$

Thus altogether we have that

$$(S_1, R_0, S_0, R'_0, S'_0) \approx_{5 \cdot 2^{-s}} (U_d, R_0, S_0, R'_0, S'_0).$$

Moreover, conditioned on $(R_0, S_0, R'_0, S'_0)$, $(S_1, S'_1)$ are both deterministic functions of $Q$. ■

Now we fix $(R_0, S_0, R'_0, S'_0)$. Note that after this fixing, $S_1, S'_1$ are both functions of $Q = Y_2$. Note that $Q$ has min-entropy at least $4Cd$.

For $i = 0, \ldots, C$, let $\mathsf{View}_i = (S_0, \ldots, S_i, R_0, \ldots, R_i, V_1, \ldots, V_i)$. Similarly define $\mathsf{View}'_i$ to be the corresponding variables produced by Bob. We now have the following lemma.

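Lemma 7.13. For any $i$, we have that

$$(R_i, \mathsf{View}_{i-1}, \mathsf{View}'_{i-1}, S_i, S'_i) \approx_{(2i+4)2^{-s}} (U_d, \mathsf{View}_{i-1}, \mathsf{View}'_{i-1}, S_i, S'_i)$$

and

$$(S_{i+1}, \mathsf{View}_i, \mathsf{View}'_i) \approx_{(2i+5)2^{-s}} (U_d, \mathsf{View}_i, \mathsf{View}'_i).$$

Moreover, conditioned on $(\mathsf{View}_{i-1}, \mathsf{View}'_{i-1}, S_i, S'_i)$, $(R_i, R'_i, V_i, V'_i)$ are all deterministic functions of $X$; conditioned on $(\mathsf{View}_i, \mathsf{View}'_i)$, $(S_{i+1}, S'_{i+1})$ are both deterministic functions of $Q$.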

Proof. We prove the lemma by induction on $i$. When $i = 0$, the statements are already proved in Lemma 7.12. Now we assume that the statements hold for $i = j$ and we prove them for $i = j + 1$.

We first fix $(\mathsf{View}_j, \mathsf{View}'_j)$. Since now $(S_{j+1}, S'_{j+1})$ are both deterministic functions of $Q$, they are independent of $X$. Moreover $S_{j+1}$ is $(2j+5)2^{-s}$-close to uniform. Note that the average conditional min-entropy of $X$ is at least $k - m' - 2^C(6s) - 2^C(4s) - 2Cd = k - m' - 2^C(10s) - 2Cd$. Therefore by Theorem 7.6 we have that

$$(R_{j+1}, \mathsf{View}_j, \mathsf{View}'_j, S_{j+1}, S'_{j+1}) \approx_{(2j+6)2^{-s}} (U_d, \mathsf{View}_j, \mathsf{View}'_j, S_{j+1}, S'_{j+1}).$$

Moreover, conditioned on $(\mathsf{View}_j, \mathsf{View}'_j, S_{j+1}, S'_{j+1})$, $(R_{j+1}, R'_{j+1}, V_{j+1}, V'_{j+1})$ are all deterministic functions of $X$.

Next, since conditioned on $(\mathsf{View}_j, \mathsf{View}'_j, S_{j+1}, S'_{j+1})$, $(R_{j+1}, R'_{j+1})$ are both deterministic functions of $X$, they are independent of $Q$. Moreover $R_{j+1}$ is $(2j+6)2^{-s}$-close to uniform. Note that the average conditional min-entropy of $Q$ is at least $4Cd - 2Cd = 2Cd$. Therefore by Theorem 7.6 we have that

$$(S_{j+2}, \mathsf{View}_j, \mathsf{View}'_j, S_{j+1}, S'_{j+1}, R_{j+1}, R'_{j+1}, V_{j+1}, V'_{j+1})$$
$$\approx_{(2j+7)2^{-s}} (U_d, \mathsf{View}_j, \mathsf{View}'_j, S_{j+1}, S'_{j+1}, R_{j+1}, R'_{j+1}, V_{j+1}, V'_{j+1}).$$

Namely,

$$(S_{j+2}, \mathsf{View}_{j+1}, \mathsf{View}'_{j+1}) \approx_{(2(j+1)+5)2^{-s}} (U_d, \mathsf{View}_{j+1}, \mathsf{View}'_{j+1}).$$

Moreover, conditioned on $(\mathsf{View}_{j+1}, \mathsf{View}'_{j+1})$, $(S_{j+2}, S'_{j+2})$ are both deterministic functions of $Q$. ■
Now we consider step $g$ in the alternating extraction protocol. By Claim 7.11, conditioned on $(Y_1, Y'_1, Z', \bar{X}'_g)$, $\bar{X}_g$ is a deterministic function of $X$ and is close to a source with average
conditional min-entropy m' — 2^ {6s). Since m' = when s is significantly smaller than 

5n (but we can still achieve up to s = Q,{6n)), we have that 7n' > 2'-^(12s) + 2Cd and thus 
m' -2^{6s) >2^{6s) + 2Cd. 

We now have the following lemma. 

Lemma 7.14. Conditioned on the fixing of $(\mathsf{View}_{g-1}, \mathsf{View}'_{g-1})$, $\bar{X}_g$ is a deterministic function of $X$, and the average conditional min-entropy of $\bar{X}_g$ is at least $2^C(2s)$.

Proof. We first use induction to prove that for any $i$, conditioned on the fixing of $(\mathsf{View}_i, \mathsf{View}'_i)$, $\bar{X}_g$ is a deterministic function of $X$. When $i = 0$, we first fix $(S_0, S'_0)$, which is independent of $\bar{X}_g$. After this fixing $(R_0, R'_0)$ is a deterministic function of $X$. Thus we can now fix $(R_0, R'_0)$ and conditioned on this fixing, $\bar{X}_g$ is still a deterministic function of $X$. Thus the statement holds for $i = 0$.

Now assume that the statement holds for $i = j$; we show that it also holds for $i = j + 1$. Specifically, when $(\mathsf{View}_j, \mathsf{View}'_j)$ is fixed, $(S_{j+1}, S'_{j+1})$ is a deterministic function of $Q$, which is independent of $X$. Thus we can fix $(S_{j+1}, S'_{j+1})$. Now after this fixing, $(R_{j+1}, R'_{j+1}, V_{j+1}, V'_{j+1})$ is a deterministic function of $X$. Thus we can now fix $(R_{j+1}, R'_{j+1}, V_{j+1}, V'_{j+1})$ and conditioned on this fixing, $\bar{X}_g$ is still a deterministic function of $X$. Thus the statement holds for $i = j + 1$. Therefore, for any $i$, conditioned on the fixing of $(\mathsf{View}_i, \mathsf{View}'_i)$, $\bar{X}_g$ is a deterministic function of $X$.

Finally, note that only the fixings of $(R_j, R'_j, V_j, V'_j)$ can cause $\bar{X}_g$ to lose entropy. Since the total size of $\{(R_j, R'_j, V_j, V'_j)\}$ is at most $\sum_{i=1}^{C} 2^{C-i}(4s) + 2Cd \leq 2^C(4s) + 2Cd$, by Lemma 3.11 the average conditional min-entropy of $\bar{X}_g$ is at least $2^C(2s)$. ■

Now by Lemma 7.13, conditioned on the fixing of $(\mathsf{View}_{g-1}, \mathsf{View}'_{g-1})$, $S_g \approx_{(2g+3)2^{-s}} U_d$ and $S_g, S'_g$ are both deterministic functions of $Q$, which is independent of $X$. Thus $S_g$ and $\bar{X}_g$ are independent. Therefore by Theorem 7.6 we have

$$(V_g, S_g, S'_g) \approx_{2^{-s}} (U_{2^{C-g}(2s)}, S_g, S'_g).$$

Adding back all the errors, and noting that we have fixed $(Y_1, Y'_1, Z', \bar{X}'_g)$ and $(\mathsf{View}_{g-1}, \mathsf{View}'_{g-1})$, we have that

$$(V_g, S_g, S'_g, \mathsf{View}_{g-1}, \mathsf{View}'_{g-1}, Z', \bar{X}'_g) \approx_{O(C2^{-s})} (U_{2^{C-g}(2s)}, S_g, S'_g, \mathsf{View}_{g-1}, \mathsf{View}'_{g-1}, Z', \bar{X}'_g).$$

In particular, note that $V'_g = \mathsf{Ext}_v(\bar{X}'_g, S'_g)$ and for a fixed message $w'$, $T' = \mathsf{lrMAC}_{Z'}(w')$ is a function of $Z'$. Thus we have that

$$(V_g, \mathsf{View}'_{g-1}, T', V'_g) \approx_{O(C2^{-s})} (U_{2^{C-g}(2s)}, \mathsf{View}'_{g-1}, T', V'_g).$$

This implies that

$$(V_g, T', V'_1, \ldots, V'_g) \approx_{O(C2^{-s})} (U_{2^{C-g}(2s)}, T', V'_1, \ldots, V'_g).$$

Now note that the size of $(V'_{g+1}, \ldots, V'_C)$ is at most $\sum_{i=g+1}^{C} 2^{C-i}(2s) = 2^{C-g}(2s) - 2s$, and that $V_g$ has size $2^{C-g}(2s)$. Therefore, if $V_g$ is uniform conditioned on $(T', V'_1, \ldots, V'_g)$, then by Lemma 7.9 we have that with probability $1 - 2^{-s}$ over the fixings of $(T', V'_1, \ldots, V'_C)$, $V_g$ is a source with min-entropy $s$. Thus the probability that Eve can come up with the correct $V_g$ is at most $2 \cdot 2^{-s}$. Adding back the error, we have that in the case that Eve changes $Y_1$, the probability that Alice does not reject is at most $O(C2^{-s})$. For an appropriately chosen $s = \log(1/\epsilon) + O(1)$ this is at most $\epsilon$.

Finally, note that the entropy loss of the protocol is $O(2^C s) = 2^{\mathrm{poly}(1/\delta)} \log(1/\epsilon) = O(\log(1/\epsilon))$, the randomness complexity is $O(Cd + s) = O(Cs) = \mathrm{poly}(1/\delta) \log(1/\epsilon) = O(\log(1/\epsilon))$ and the communication complexity is $O(2^C s) = 2^{\mathrm{poly}(1/\delta)} \log(1/\epsilon) = O(\log(1/\epsilon))$. ■

8 Conclusions and Open Problems 

In this paper we present a strong connection between non-malleable extractors and two-source extractors. First, we show that non-malleable extractors can be used to construct two-source extractors. If the non-malleable extractor works for small min-entropy and has a short seed length with respect to the error, then the resulting two-source extractor beats the best known construction of two-source extractors. Second, we show that two-source extractors of the form $\mathsf{IP}(f(X), Y)$ can be used to construct non-malleable extractors. Using this connection, we give the first explicit constructions of non-malleable extractors for min-entropy $k < n/2$.

The most important message from this part is perhaps that non-malleable extractors and two-source extractors, although seemingly different, are closely related. Thus, future research should probably consider these two kinds of extractors together, as improvements in one kind may lead to improvements in the other. Our connection also suggests that it may be hard to construct non-malleable extractors for small entropy. However, strictly speaking, our result only shows that it may be hard to construct non-malleable extractors for small entropy with short seed length with respect to the error. It is entirely possible that we can get explicit non-malleable extractors for small entropy with large seed length. Moreover, the weaker notion of non-malleable condensers introduced in [Li12] is a hopeful alternative.

We also give the first privacy amplification protocol for $k = \delta n$ that simultaneously achieves optimal round complexity (2 rounds), asymptotically optimal entropy loss and communication complexity. However, our entropy loss is $2^{\mathrm{poly}(1/\delta)} s$, which has a large hidden constant for small $\delta$. As a comparison, the protocol in [DLWZ11] runs in $\mathrm{poly}(1/\delta)$ rounds but only has entropy loss $\mathrm{poly}(1/\delta)s$. Thus for practical purposes it is interesting to see if we can reduce the hidden constant. In particular, it remains an interesting open problem to construct non-malleable extractors or non-malleable condensers for arbitrarily linear min-entropy.
A The Existence of Generalized Non-Malleable Extractors

In this section we prove the existence of non-malleable extractors with more than one adversarial 
seeds. First, we have the following definition. 

Definition A.1. A function $\mathsf{nmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(r, k, \epsilon)$-non-malleable extractor if, for any source $X$ with $H_\infty(X) \geq k$ and any $r$ functions $A_i : \{0,1\}^d \to \{0,1\}^d$, $i = 1, \ldots, r$, such that $A_i(y) \neq y$ for all $i$ and $y$, the following holds. When $Y$ is chosen uniformly from $\{0,1\}^d$ and independent of $X$,

$$(\mathsf{nmExt}(X,Y), \{\mathsf{nmExt}(X, A_i(Y))\}, Y) \approx_\epsilon (U_m, \{\mathsf{nmExt}(X, A_i(Y))\}, Y).$$

We will prove the following theorem. 

Theorem A. 2. For any constant r > 1, there exists a (r, k, e) -non-malleable extractor as long as 

d>^ log(n -k) + 3 log(l/e) + 0(1) 

k>{r + l)m + ^ + 2 log(l/e) + log{d) + 0(1) 

We prove the theorem by using the probabilistic method, similar to the existence proof in [DW09]. A function $f : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(r, k, \epsilon)$-non-malleable extractor if for all $(n, k)$ sources $X$, all adversarial functions $\{A_i\}$ and all distinguishers $\mathcal{D}$, we have

$$|\Pr[\mathcal{D}(f(X,Y), \{f(X, A_i(Y))\}, Y) = 1] - \Pr[\mathcal{D}(U_m, \{f(X, A_i(Y))\}, Y) = 1]| \leq \epsilon.$$

As usual, it suffices to consider flat sources $X$. For the purpose of a union bound, we fix some $\mathcal{D}$, $\{A_i\}$ and a source $X$ which is uniformly distributed on some subset $\mathrm{Supp}(X) \subseteq \{0,1\}^n$ with $|\mathrm{Supp}(X)| = 2^k$. We use $F$ to denote the uniform distribution over the space of all functions $f : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$.

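For each $u \in (\{0,1\}^m)^r$ and $y \in \{0,1\}^d$, define

$$\mathrm{Count}(u, y) = |\{u_2 \in \{0,1\}^m : \mathcal{D}(u_2, u, y) = 1\}|.$$

For each $x \in \mathrm{Supp}(X)$, $y \in \{0,1\}^d$, define the following random variables (where the randomness
comes from $F$):

$$L(x, y) = \mathcal{D}(F(x, y), \{F(x, A_i(y))\}, y), \qquad R(x, y) = \frac{1}{2^m} \mathrm{Count}(\{F(x, A_i(y))\}, y),$$

and let

$$Q(x, y) = L(x, y) - R(x, y), \qquad Q = \frac{1}{2^{k+d}} \sum_{x \in \mathrm{Supp}(X),\, y \in \{0,1\}^d} Q(x, y).$$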

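Thus, $Q$ is essentially the quantity

$$P(f) = \Pr[\mathcal{D}(f(X,Y), \{f(X,A_i(Y))\}, Y) = 1] - \Pr[\mathcal{D}(U_m, \{f(X,A_i(Y))\}, Y) = 1]$$

for $f$ drawn from $F$. Therefore, we want to upper bound

$$\Pr[|Q| > \varepsilon] = \Pr_{f \leftarrow F}[|P(f)| > \varepsilon].$$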

Again, it is easy to see that for any $(x, y)$, $E(L(x,y)) = E(R(x,y))$, and thus $E(Q(x,y)) = 0$
and $E(Q) = 0$. However, the variables $Q(x, y)$ are not necessarily independent of each other (in
particular, the adversarial seeds can form cycles), so we cannot use a simple Chernoff bound here.
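
The equality of expectations can be spelled out as follows. This is a sketch of the standard argument, assuming the normalization $R(x,y) = 2^{-m}\,\mathrm{Count}(\{F(x,A_i(y))\}, y)$ and writing $\mathcal{D}$ for the distinguisher. It uses only that $A_i(y) \neq y$ for every $i$, so that $F(x,y)$ is uniform over $\{0,1\}^m$ and independent of the tuple $\{F(x,A_i(y))\}$.

```latex
\begin{align*}
E[L(x,y)]
  &= \sum_{u} \Pr\bigl[\{F(x,A_i(y))\} = u\bigr] \cdot
     \Pr_{u_2 \leftarrow U_m}\bigl[\mathcal{D}(u_2, u, y) = 1\bigr] \\
  &= \sum_{u} \Pr\bigl[\{F(x,A_i(y))\} = u\bigr] \cdot
     \frac{\mathrm{Count}(u, y)}{2^m} \\
  &= E[R(x,y)].
\end{align*}
```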
Now we represent the functions $\{A_i\}$ as a directed graph $G = (V, E)$ where the vertex set is
$V = \{0,1\}^d$ and there is an edge from $y$ to $y'$ iff $\exists i$, $A_i(y) = y'$. Note that each $A_i$ is a function,
thus the out-degree of every vertex is exactly $r$ (counting parallel edges). The following lemma is proved in [CRS12].

Lemma A.3. Let $G = (V, E)$ be a directed graph without self-loops. Assume that the out-degree of
each vertex is at most $r$, where parallel edges are allowed. Then, there exists a subset of the vertices
$V' \subseteq V$, such that the induced graph $H = (V', E')$ of $G$ is acyclic, and $|V'| \geq |V|/(r + 1)$.
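
One standard way to see why a set of this size exists is a random-ordering argument. It is given here as an illustration and is not claimed to be the proof used in [CRS12]. Keep a vertex exactly when it precedes all of its out-neighbours in a uniformly random ordering: each vertex is kept with probability at least $1/(r+1)$, and every edge between kept vertices points forward in the ordering, so the kept set induces an acyclic graph. The function names and the toy adversaries below are hypothetical.

```python
import random

def acyclic_induced_subset(out_neighbors, r, rng):
    """Return a vertex set V' whose induced subgraph is acyclic and with
    |V'| >= |V| / (r + 1). `out_neighbors` maps each vertex to a list of
    at most r out-neighbours (no self-loops, parallel edges allowed)."""
    vertices = list(out_neighbors)
    while True:
        order = vertices[:]
        rng.shuffle(order)
        position = {v: i for i, v in enumerate(order)}
        # Keep v only if it comes before every one of its out-neighbours.
        chosen = {v for v in vertices
                  if all(position[v] < position[w] for w in out_neighbors[v])}
        # The expected size is at least |V|/(r+1), so some ordering meets
        # the bound; retry until one is found.
        if len(chosen) * (r + 1) >= len(vertices):
            return chosen

def is_acyclic(out_neighbors, subset):
    """Check acyclicity of the induced subgraph by repeatedly removing sinks."""
    remaining = set(subset)
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if not any(w in remaining for w in out_neighbors[v]):
                remaining.discard(v)
                changed = True
    return not remaining

# Toy instance with d = 4 and r = 2: A_1(y) = y + 1 mod 16, A_2(y) = y XOR 1.
graph = {y: [(y + 1) % 16, y ^ 1] for y in range(16)}
subset = acyclic_induced_subset(graph, 2, random.Random(0))
```

Applying such a routine repeatedly to the remaining vertices gives a decomposition of the kind used in the proof below.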

Let $s = 1/(r+1)$. We now use Lemma A.3 to decompose $G$ into $t+1$ subgraphs $H_j$ as follows. In
each step $j$, $1 \leq j \leq t$, we use the lemma to pick an $s$ fraction of vertices from the remaining vertices
to form a subset $V_j$ and an induced graph $H_j$, and delete these vertices from $G$. After $t$ steps the
remaining graph is $H_{t+1}$. Thus we have that $|V_j| = s(1-s)^{j-1}|V|$ for $j \leq t$ and $|V_{t+1}| = (1-s)^t |V|$.

The following lemma is proved in [DW09].

Lemma A.4. For $V' \subseteq V$, let $H$ be the restriction of $G$ to the vertices $V'$ and assume that the
graph $H$ is acyclic. Then the set $\{Q(x,y)\}_{x \in \mathrm{Supp}(X),\, y \in V'}$ of random variables can be enumerated
by $Q_1, \ldots, Q_\ell$ for $\ell = |V'| 2^k$ such that $E[Q_i \mid Q_1, \ldots, Q_{i-1}] = 0$ for all $1 \leq i \leq \ell$.

We can now prove the theorem. 

Proof of Theorem A.2. By Lemma A.3 and Lemma A.4, we can partition $\{Q(x, y)\}$ into $t$ (enumerated)
sets $\{Q^j_1, \ldots, Q^j_{l_j}\}$ where $l_j = |V_j| 2^k$ for $j = 1, \ldots, t$ and a remaining set $\{Q(x,y)\}_{x \in \mathrm{Supp}(X),\, y \in V_{t+1}}$.
For each $j \leq t$, $1 \leq i \leq l_j$, we have that $E[Q^j_i \mid Q^j_1, \ldots, Q^j_{i-1}] = 0$. Now for each $j \leq t$, $1 \leq i \leq l_j$,
define $S^j_i = \sum_{h=1}^{i} Q^j_h$. Then for any $j \leq t$, $S^j_1, \ldots, S^j_{l_j}$ is a martingale.

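We first show that if $|S^j_{l_j}| \leq \frac{\varepsilon}{2} l_j$ for every $j \leq t$ and $|V_{t+1}| \leq \frac{\varepsilon}{2} 2^d$, then $|Q| \leq \varepsilon$. Indeed, note
that $|Q(x,y)| \leq 1$ for any $(x, y)$. Thus in this case we have that

$$|Q| = \frac{1}{2^{k+d}} \left| \sum_{j=1}^{t} S^j_{l_j} + \sum_{x \in \mathrm{Supp}(X),\, y \in V_{t+1}} Q(x,y) \right|
\leq \frac{1}{2^{k+d}} \left( \sum_{j=1}^{t} |S^j_{l_j}| + \left| \sum_{x \in \mathrm{Supp}(X),\, y \in V_{t+1}} Q(x,y) \right| \right)
\leq \frac{1}{2^{k+d}} \left( \frac{\varepsilon}{2} 2^{d+k} + \frac{\varepsilon}{2} 2^{d+k} \right) = \varepsilon.$$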



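Next, by Azuma's inequality we have that for any $j \leq t$,

$$\Pr\left[ |S^j_{l_j}| > \frac{\varepsilon}{2} l_j \right] \leq e^{-\Omega(\varepsilon^2 l_j)} \leq e^{-\Omega(\varepsilon^2 s (1-s)^{t-1} 2^{d+k})},$$

since $l_j = |V_j| 2^k \geq s(1-s)^{t-1} 2^{d+k}$.
Therefore, by the union bound,

$$\Pr\left[ \exists j \leq t,\ |S^j_{l_j}| > \frac{\varepsilon}{2} l_j \right] \leq t\, e^{-\Omega(\varepsilon^2 s (1-s)^{t-1} 2^{d+k})}.$$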

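Now we will apply the union bound to all possible $X$, $\{A_i\}$, $\mathcal{D}$. Let $N = 2^n$, $K = 2^k$, $D = 2^d$,
$M = 2^m$. Then there are $\binom{N}{K}$ possible sources $X$, there are $D^{rD}$ possible adversaries $\{A_i\}$ and
there are $2^{D M^{r+1}}$ possible distinguishers $\mathcal{D} : \{0,1\}^m \times (\{0,1\}^m)^r \times \{0,1\}^d \to \{0,1\}$. Thus to ensure
that there exists a function that is an $(r, k, \varepsilon)$-non-malleable extractor, all we need is to satisfy the
following inequalities:

$$\binom{N}{K} D^{rD}\, 2^{D M^{r+1}} \cdot t\, e^{-\Omega(\varepsilon^2 s (1-s)^{t-1} K D)} < 1$$

and

$$|V_{t+1}| = (1-s)^t 2^d \leq \frac{\varepsilon}{2} 2^d.$$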



Choose t such that (1 — s)* = 2 3. Now it is easy to check that both of these conditions are 
satisfied when the statements in the theorem hold. ■

B Another Construction of Non-Malleable Extractors for Entropy Rate $1/2 - \delta$

Here we give another construction of non-malleable extractors for $(n, k)$ sources with $k = (1/2 - \delta)n$
for some constant $\delta > 0$.

Given an $(n, k)$-source $X$ with $k = (1/2 - \delta)n$, we first pick a prime $p$ such that $2^n < p < 2^{n+1}$.
By Bertrand's postulate, there is always such a prime. Now treat $X$ as an element in the field $\mathbb{F}_p$.
Next we take an independent and uniform seed $Y \in \{0,1\}^n$ and again treat $Y$ as an element in $\mathbb{F}_p$.
Encode $X, Y$ such that $\mathrm{Enc}(X) = (X, X^2)$ and $\mathrm{Enc}(Y) = (Y, Y^2)$. The operations are in $\mathbb{F}_p$. Our
non-malleable extractor is defined as

$$\mathrm{nmExt}(X, Y) = \mathrm{IP}(\mathrm{Enc}(X), \mathrm{Enc}(Y)) \bmod M$$

for some integer $M = 2^m$ that we will choose later. Note that $\mathrm{Enc}(X)$ and $\mathrm{Enc}(Y)$ are vectors in
$\mathbb{F}_p^2$ and IP is the inner product function taken over $\mathbb{F}_p$.
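
The construction above can be sketched in a few lines for toy input lengths. The function names (`pick_prime`, `enc`, `nm_ext`) are hypothetical, and the trial-division primality test is an assumption made for brevity; it is adequate only for very small $n$, and a realistic instantiation would need a proper primality test.

```python
def is_prime(q):
    """Trial division; adequate only for the toy sizes used here."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True

def pick_prime(n):
    """Smallest prime p with 2^n < p < 2^(n+1); one exists by Bertrand's postulate."""
    for q in range(2 ** n + 1, 2 ** (n + 1)):
        if is_prime(q):
            return q
    raise ValueError("no prime found")  # not reached for n >= 1

def enc(v, p):
    """Enc(v) = (v, v^2) over F_p."""
    return (v % p, (v * v) % p)

def nm_ext(x, y, n, m):
    """nmExt(x, y) = IP(Enc(x), Enc(y)) mod 2^m, inner product taken over F_p."""
    p = pick_prime(n)
    ex, ey = enc(x, p), enc(y, p)
    inner = (ex[0] * ey[0] + ex[1] * ey[1]) % p
    return inner % (2 ** m)
```

For example, with $n = 4$ the prime is $p = 17$, and `nm_ext(5, 9, 4, 2)` computes $5 \cdot 9 + 8 \cdot 13 = 149 \equiv 13 \pmod{17}$ and then reduces modulo $4$.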

Again, we show that for any weak source $X$ with min-entropy $(1/2 - \delta)n$, $3\mathrm{Enc}(X)$ is close to
a weak source that has min-entropy $(1/2 + \delta) \log(p^2)$.

Lemma B.1. Let $\mathbb{F} = \mathbb{F}_p$ for $p$ prime and $X$ be a random variable over $\mathbb{F}$. There is a universal
constant $\delta > 0$ such that if $X$ is any weak source with min-entropy $(1/2 - \delta)n$, $3\mathrm{Enc}(X)$ is
$p^{-\Omega(1)}$-close to a source with min-entropy $(1/2 + \delta) \log(p^2)$.

Proof. Note that $X$ has min-entropy $(1/2 - \delta)n > (1/2 - \delta) \log p - 1$. First consider the distribution
$2\mathrm{Enc}(X)$. Note that the distribution is of the form $(X_1 + X_2, X_1^2 + X_2^2)$, where $X_1, X_2$ are independent
copies of $X$. For any $(a, b)$ in the support of $2\mathrm{Enc}(X)$, we have that $a = x_1 + x_2$ and $b = x_1^2 + x_2^2$.
Thus there are at most 2 different pairs $(x_1, x_2)$ that satisfy both equations. Therefore the min-entropy
of $2\mathrm{Enc}(X)$ is at least $2H_\infty(X) - 1$. Now let $k = H_\infty(X) - 1$; then $\mathrm{Enc}(X)$ has min-entropy at least
$k$ and $2\mathrm{Enc}(X)$ has min-entropy at least $2k$. We now have the following claim.
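
The counting step (at most 2 pairs $(x_1, x_2)$ for each $(a, b)$) holds because substituting $x_2 = a - x_1$ gives a quadratic in $x_1$ with leading coefficient $2 \neq 0$ over $\mathbb{F}_p$ for odd $p$. A brute-force check for a small prime, with a hypothetical helper name, might look like this:

```python
from collections import Counter

def max_pairs_per_sum(p):
    """Largest number of pairs (x1, x2) in F_p^2 that share the same
    value of (x1 + x2, x1^2 + x2^2)."""
    counts = Counter(((x1 + x2) % p, (x1 * x1 + x2 * x2) % p)
                     for x1 in range(p) for x2 in range(p))
    return max(counts.values())
```

The value 2 is attained because $(x_1, x_2)$ and $(x_2, x_1)$ always map to the same pair $(a, b)$.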

Claim B.2. Let $\alpha, \beta$ be the two constants in Theorem 3.21. Then $3\mathrm{Enc}(X)$ is $2^{-\alpha k/2}$-close to a
source with min-entropy $(1 + \alpha/2)2k$.

Proof of the claim. Note that an element in the support of $3\mathrm{Enc}(X)$ has the form
$(x_1 + x_2 + x_3,\ x_1^2 + x_2^2 + x_3^2)$. This determines the point

$$(x_1 + x_2 + x_3,\ (x_1 + x_2 + x_3)^2 - (x_1^2 + x_2^2 + x_3^2))
= ((x_1 + x_2) + x_3,\ 2(x_1 + x_2)x_3 + (x_1 + x_2)^2 - (x_1^2 + x_2^2)).$$

Let $a = x_1 + x_2$ and $b = x_1^2 + x_2^2$; this point is

$$(a + x_3,\ 2a x_3 + a^2 - b).$$

Let $\bar{x}_3 = a + x_3$, then

$$(a + x_3,\ 2a x_3 + a^2 - b) = (\bar{x}_3,\ 2a \bar{x}_3 - a^2 - b).$$

For a fixed $(a = x_1 + x_2,\ b = x_1^2 + x_2^2)$ define the line

$$\ell_{a,b} = \{(x,\ 2ax - a^2 - b) \mid x \in \mathbb{F}\}.$$

Note that for different $(a, b)$, the line $\ell_{a,b}$ is also different. Thus we have a set of lines
$L = \{\ell_{a,b}\}$. Note that $x_3$ is sampled from $X_3$, which has min-entropy $k$, and $(a, b)$ is sampled from
$\mathrm{Enc}(X_1) + \mathrm{Enc}(X_2)$, which has min-entropy $2k$. Further note that these two distributions are
independent. Since every weak source with min-entropy $k$ is a convex combination of flat $k$-sources,
without loss of generality we can assume that $X_3$ and $\mathrm{Enc}(X_1) + \mathrm{Enc}(X_2)$ are both flat sources.
Thus $L$ has size $2^{2k}$.

Now assume that $3\mathrm{Enc}(X)$ is $\varepsilon$-far from any source with min-entropy $(1 + \alpha/2)2k$. Since $3\mathrm{Enc}(X)$
determines the distribution $(A + X_3,\ 2AX_3 + A^2 - B)$, this distribution is also $\varepsilon$-far from any source
with min-entropy $(1 + \alpha/2)2k$. Thus there must exist some set $M$ of size at most $2^{(1+\alpha/2)2k}$ such
that

$$\Pr_{(a,b) \leftarrow 2\mathrm{Enc}(X),\ x_3 \leftarrow X} \left[ (a + x_3,\ 2a x_3 + a^2 - b) \in M \right] \geq \varepsilon.$$

Note that whenever $(a + x_3,\ 2a x_3 + a^2 - b) \in M$, this point has an incidence with the line $\ell_{a,b}$.
Further note that whenever $(a, b)$ is different or $x_3$ is different, the incidence is also different. Thus
by the above inequality the number of incidences between the set of points $M$ and the set of lines
$L$ is at least

$$\Pr_{(a,b) \leftarrow 2\mathrm{Enc}(X),\ x_3 \leftarrow X} \left[ (a + x_3,\ 2a x_3 + a^2 - b) \in M \right] \cdot 2^{2k} 2^{k} \geq \varepsilon 2^{3k}.$$

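On the other hand, since $L$ has size $2^{2k}$ and $M$ has size $2^{(1+\alpha/2)2k} \leq 2^{(1+\alpha/2) 2 (1/2-\delta) \log p} \leq
2^{(1+\alpha/2) \log p} \leq p^{2-\beta}$, by Theorem 3.21, the number of incidences between $M$ and $L$ is at most

$$O\left(2^{(3/2-\alpha)(2+\alpha)k}\right) \leq 2^{3k(1-\alpha/6)} = 2^{-\alpha k/2}\, 2^{3k}.$$

Thus we must have $\varepsilon \leq 2^{-\alpha k/2}$. □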

By choosing $\delta$ appropriately and noting that $k \geq (1/2 - \delta) \log p - 2$, the lemma is proved. □

Now we can use the non-uniform XOR lemma to argue that our extractor is non-malleable. 
Specifically, we have the following lemma. 

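Lemma B.3. Let $\delta$ be the constant in Lemma B.1. Given any $(n, k)$-source $X$ with $k = (1/2 - \delta)n$,
and $Y$ an independent source over $\{0,1\}^n$ with min-entropy $(1 - \delta)n$, let $W = \mathrm{IP}(\mathrm{Enc}(X), \mathrm{Enc}(Y))$
and $W' = \mathrm{IP}(\mathrm{Enc}(X), \mathrm{Enc}(Y'))$ where $Y' = A(Y)$ and $\forall y \in \{0,1\}^n$, $A(y) \neq y$. For any two
characters $\psi(s) = e^{2\pi i t s/p}$ and $\psi'(s) = e^{2\pi i t' s/p}$ where $t, t' \in \mathbb{F}_p$ and $t \neq 0$,

$$\left| E_{W,W'}[\psi(W)\psi'(W')] \right| \leq 2^{-\Omega(n)}.$$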
Proof. Note that $W, W'$ are deterministic functions of $X, Y$. Thus

$$E_{W,W'}[\psi(W)\psi'(W')] = E_{X,Y}[\psi(W)\psi'(W')].$$

Depending on whether $\psi'$ is trivial, we have two cases.

Case 1: $t' = 0$. This corresponds to the case where $\psi'$ is the trivial character. In this case
$\psi'(W')$ is always 1. Thus

$$E_{W,W'}[\psi(W)\psi'(W')] = E_{X,Y}[\psi(W)] = E_{X,Y}[\psi(\mathrm{Enc}(X) \cdot \mathrm{Enc}(Y))].$$

Note that $\mathrm{Enc}(Y)$ has the same min-entropy as $Y$, which is $(1 - \delta)n$. Now consider $\mathrm{Enc}(X)$.
Since $X$ has min-entropy $(1/2 - \delta)n$, by Lemma B.1 $3\mathrm{Enc}(X)$ is $p^{-\Omega(1)}$-close to having min-entropy
$(1/2 + \delta) \log(p^2)$. Now note that the min-entropy of $4\mathrm{Enc}(X) - 4\mathrm{Enc}(X)$ is at least the min-entropy
of $4\mathrm{Enc}(X)$, which in turn is at least the min-entropy of $3\mathrm{Enc}(X)$. Thus $4\mathrm{Enc}(X) - 4\mathrm{Enc}(X)$
is $p^{-\Omega(1)}$-close to having min-entropy $(1/2 + \delta) \log(p^2)$. Since
$(1/2 + \delta) \log(p^2) + (1 - \delta)n > (1 + 2\delta) \log p + (1 - \delta)(\log p - 1) > (2 + \delta) \log p - 1$, by Lemma 3.20 we have

$$\left| E_{W,W'}[\psi(W)\psi'(W')] \right| = \left| E_{X,Y}[\psi(\mathrm{Enc}(X) \cdot \mathrm{Enc}(Y))] \right| \leq \left( p^2\, 2^{1-(2+\delta)\log p} \right)^{1/16} + p^{-\Omega(1)} = 2^{-\Omega(n)}.$$

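Case 2: $t' \neq 0$. This corresponds to the case where $\psi'$ is non-trivial. In this case, note that

$$\psi(W)\psi'(W') = e^{2\pi i t (\mathrm{Enc}(X) \cdot \mathrm{Enc}(Y))/p}\, e^{2\pi i t' (\mathrm{Enc}(X) \cdot \mathrm{Enc}(Y'))/p} = e^{2\pi i t (\mathrm{Enc}(X) \cdot (\mathrm{Enc}(Y) + r\mathrm{Enc}(Y')))/p},$$

where $r = t'/t \in \mathbb{F}_p$ and $r \neq 0$ since $t \neq 0$ and $t' \neq 0$.
Let $\overline{\mathrm{Enc}}(Y) = \mathrm{Enc}(Y) + r\mathrm{Enc}(Y')$, then

$$E_{W,W'}[\psi(W)\psi'(W')] = E_{X,Y}[\psi(W)\psi'(W')] = E_{X,Y}[\psi(\mathrm{Enc}(X) \cdot \overline{\mathrm{Enc}}(Y))].$$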

Now again by the same argument as above we have that $4\mathrm{Enc}(X) - 4\mathrm{Enc}(X)$ is $p^{-\Omega(1)}$-close to
having min-entropy $(1/2 + \delta) \log(p^2)$. Now we only need to bound the min-entropy of $\mathrm{Enc}(Y) + r\mathrm{Enc}(Y')$.
If for every two different $y_1, y_2$, we have that $\mathrm{Enc}(y_1) + r\mathrm{Enc}(y_1') \neq \mathrm{Enc}(y_2) + r\mathrm{Enc}(y_2')$, then
obviously $\mathrm{Enc}(Y) + r\mathrm{Enc}(Y')$ will have the same min-entropy as $Y$. Now assume that for some two different
$y_1, y_2$, we have $\mathrm{Enc}(y_1) + r\mathrm{Enc}(y_1') = \mathrm{Enc}(y_2) + r\mathrm{Enc}(y_2')$.
This gives us

$$y_1 + r y_1' = y_2 + r y_2'$$

and

$$y_1^2 + r (y_1')^2 = y_2^2 + r (y_2')^2.$$

Hence we get

$$y_1 - y_2 = r(y_2' - y_1')$$

and

$$(y_1 + y_2)(y_1 - y_2) = r(y_2' + y_1')(y_2' - y_1').$$

Since $y_1 \neq y_2$ and $r \neq 0$, we must have that $y_1' \neq y_2'$. Thus we get

$$y_1 + y_2 = y_2' + y_1'.$$

Therefore we can completely solve the equations and get

$$y_1' = ((r + 1)y_2 + (r - 1)y_1)/2r, \qquad y_2' = ((r + 1)y_1 + (r - 1)y_2)/2r.$$



Thus any element in the support of $\mathrm{Enc}(Y) + r\mathrm{Enc}(Y')$ can come from at most 2 elements in $\mathrm{Supp}(Y)$. To see
this, assume for the sake of contradiction that we have $\mathrm{Enc}(y_1) + r\mathrm{Enc}(y_1') = \mathrm{Enc}(y_2) + r\mathrm{Enc}(y_2') =
\mathrm{Enc}(y_3) + r\mathrm{Enc}(y_3')$ for three different $y_1, y_2, y_3$. Thus by the above we have

$$y_1' = ((r + 1)y_2 + (r - 1)y_1)/2r$$

and

$$y_1' = ((r + 1)y_3 + (r - 1)y_1)/2r.$$

Note that $r \neq -1$ since otherwise this would imply that $y_1' = y_1$, which contradicts the assumption
that $\forall y$, $A(y) \neq y$. Thus we get $y_2 = y_3$, another contradiction.
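
The "at most 2 preimages" step can be checked exhaustively for a small prime. The sketch below is an illustration with hypothetical helper names; for simplicity it lets the seed range over all of $\mathbb{F}_p$ rather than only $\{0,1\}^n$, which can only make collisions more likely, and it samples the fixed-point-free function $A$ at random.

```python
import random
from collections import Counter

def max_preimages(p, A, r):
    """Largest number of seeds y in F_p sharing one value of
    Enc(y) + r * Enc(A(y)), where Enc(y) = (y, y^2) over F_p."""
    counts = Counter(((y + r * A[y]) % p, (y * y + r * A[y] * A[y]) % p)
                     for y in range(p))
    return max(counts.values())

def random_fixed_point_free(p, rng):
    """A random table A on F_p with A[y] != y for every y."""
    return [rng.choice([z for z in range(p) if z != y]) for y in range(p)]

rng = random.Random(1)
p = 13
# Worst case over every nonzero r and 20 random adversaries per r.
worst = max(max_preimages(p, random_fixed_point_free(p, rng), r)
            for r in range(1, p) for _ in range(20))
```

By the argument above, `worst` never exceeds 2 for an odd prime $p$, whichever fixed-point-free $A$ is sampled.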

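Therefore the min-entropy of $\mathrm{Enc}(Y) + r\mathrm{Enc}(Y')$ is at least $H_\infty(Y) - 1 = (1 - \delta)n - 1$. Now since
$(1/2 + \delta)\log(p^2) + (1 - \delta)n - 1 > (1 + 2\delta)\log p + (1 - \delta)(\log p - 1) - 1 > (2 + \delta) \log p - 2$, by Lemma 3.20
we have

$$\left| E_{W,W'}[\psi(W)\psi'(W')] \right| = \left| E_{X,Y}[\psi(\mathrm{Enc}(X) \cdot (\mathrm{Enc}(Y) + r\mathrm{Enc}(Y')))] \right| \leq \left( p^2\, 2^{2-(2+\delta)\log p} \right)^{1/16} + p^{-\Omega(1)} = 2^{-\Omega(n)}.$$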

□ 

Now we can prove the following theorem. 

Theorem B.4. Let $\delta$ be the constant from Lemma B.1. Given any $(n, k)$ source $X$ with $k =
(1/2 - \delta)n$ and an independent uniform seed $Y \in \{0,1\}^n$, as well as any deterministic function
$A : \{0,1\}^n \to \{0,1\}^n$ such that $\forall y$, $A(y) \neq y$,

$$\left| (\mathrm{nmExt}(X,Y), \mathrm{nmExt}(X,A(Y)), Y) - (U_m, \mathrm{nmExt}(X,A(Y)), Y) \right| \leq \varepsilon,$$

where $\varepsilon = 2^{-\Omega(n)}$ and output size $m = \Omega(n)$.

Proof. Let $Z = \mathrm{nmExt}(X, Y)$ and $Z' = \mathrm{nmExt}(X, A(Y))$. By Lemma B.3 and Lemma 3.14, we can
choose an $m = \Omega(n)$ and $M = 2^m$ such that when $\mathrm{nmExt}(X, Y) = \mathrm{IP}(\mathrm{Enc}(X), \mathrm{Enc}(Y)) \bmod M$ and
$Y$ is an $(n, (1 - \delta)n)$ source independent of $X$, we have

$$|(Z, Z') - (U_m, Z')| \leq \varepsilon',$$

where $\varepsilon' = O(n 2^m 2^{-\Omega(n)} + 2^{m-n}) = 2^{-\Omega(n)}$.

Therefore when $Y$ is an independent uniform distribution over $\{0,1\}^n$, by Theorem 3.17 we
have

$$|(Z, Z', Y) - (U_m, Z', Y)| \leq \varepsilon,$$

where $\varepsilon = 2^{2m}(2^{1-\delta n} + \varepsilon')$.

Note that $\varepsilon' = O(n 2^m 2^{-\Omega(n)} + 2^{m-n})$. Thus we can take $m = \Omega(n)$ and $\varepsilon = 2^{2m}(2^{1-\delta n} + \varepsilon') =
2^{-\Omega(n)}$. The theorem is proved. ■